Abstract

The aim of this paper is to prove coding theorems for the wiretap channel and secret key agreement based 
on the notion of a hash property for an ensemble of functions. These theorems imply that codes using sparse
matrices can achieve the optimal rate. Furthermore, fixed-rate universal coding theorems for a wiretap channel 
and a secret key agreement are also proved. 
I. Introduction

The aim of this paper is to prove the coding theorems for the wiretap channel (Fig. 1) introduced in [23] and
the secret key agreement problem (Fig. 2) introduced in [12][1]. The proofs of the theorems are based on the notion of a
hash property for an ensemble of functions introduced in [18] [19]. This notion provides a sufficient condition for 
the achievability of coding theorems. Since an ensemble of sparse matrices has a hash property, we can construct 
codes by using sparse matrices where the rate of codes is close to the optimal rate. In the construction of codes, 
we use minimum-divergence encoding, maximum-likelihood decoding, and minimum-entropy decoding, where 
we can use the approximation methods introduced in [9] [5] to realize these operations. 

Wiretap channel coding using sparse matrices is studied in [21] for binary erasure wiretap channels. On the
other hand, our construction can be applied to any stationary memoryless channel. It should be noted here that
the encoder design is based on the standard channel code presented in [14][18][19][13]. Furthermore, we prove
the fixed-rate universal coding theorem for a wiretap channel, where our construction is reliable and secure for
any channel under some conditions specified by the encoding rate. Universality is not considered in [23][21].
Theory Workshop (ITW2009), Taormina, Italy, pp. 105-109, 2009. This paper is submitted to IEEE Transactions on Information Theory,
Feb. 2010. 



April 12, 2010 



DRAFT 



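[Figure 1: block diagram. The sender encodes the message $M$ into the channel input $X$; the channel $\mu_{YZ|X}$ delivers $Y$ to the receiver, which outputs an estimate of $M$, and $Z$ to the eavesdropper. The figure also displays the capacity formula stated in (10).]

Fig. 1. Wiretap Channel Coding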
Fig. 2. Secret Key Agreement from Correlated Source Outputs



The secret key agreement from correlated source outputs using sparse matrices is studied in [15][16], where 
both non-universal and universal codes are considered. Our construction is the same as that proposed in [16]. 
It should be noted that the linearity of functions is not assumed in our proof of reliability and security while it 
is assumed in [16]. Furthermore, an expurgated ensemble of sparse matrices is not assumed in our proof while 
it is assumed in [16]. 

II. Definitions and Notations 
Throughout this paper, we use the following definitions and notations. 

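The cardinality of a set $\mathcal{U}$ is denoted by $|\mathcal{U}|$, $\mathcal{U}^c$ denotes the complement of $\mathcal{U}$, and $\mathcal{U}\setminus\mathcal{V} \equiv \mathcal{U}\cap\mathcal{V}^c$.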

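Column vectors and sequences are denoted in boldface. Let $A\boldsymbol{u}$ denote a value taken by a function $A : \mathcal{U}^n \to \overline{\mathcal{U}}$ at $\boldsymbol{u} \equiv (u_1, \ldots, u_n) \in \mathcal{U}^n$, where $\mathcal{U}^n$ is the domain of the function. It should be noted that $A$ may be nonlinear. When $A$ is a linear function expressed by an $l \times n$ matrix, we assume that $\mathcal{U}$ is a finite field and the range of functions is defined by $\overline{\mathcal{U}} \equiv \mathcal{U}^l$. It should be noted that this assumption is not essential for general (nonlinear) functions because the discussion is not changed if $l\log|\mathcal{U}|$ is replaced by $\log|\overline{\mathcal{U}}|$. For a set $\mathcal{A}$ of functions, let $\mathrm{Im}\mathcal{A}$ be defined as

\[ \mathrm{Im}\mathcal{A} \equiv \bigcup_{A\in\mathcal{A}} \{A\boldsymbol{u} : \boldsymbol{u}\in\mathcal{U}^n\}. \]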

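We define sets $\mathcal{C}_A(\boldsymbol{c})$, $\mathcal{C}_{AB}(\boldsymbol{c},\boldsymbol{m})$, and $\mathcal{C}_{AB\widehat{B}}(\boldsymbol{c},\boldsymbol{m},\boldsymbol{w})$ as

\[ \mathcal{C}_A(\boldsymbol{c}) \equiv \{\boldsymbol{u} : A\boldsymbol{u} = \boldsymbol{c}\} \]
\[ \mathcal{C}_{AB}(\boldsymbol{c},\boldsymbol{m}) \equiv \{\boldsymbol{u} : A\boldsymbol{u} = \boldsymbol{c},\ B\boldsymbol{u} = \boldsymbol{m}\} \]
\[ \mathcal{C}_{AB\widehat{B}}(\boldsymbol{c},\boldsymbol{m},\boldsymbol{w}) \equiv \{\boldsymbol{u} : A\boldsymbol{u} = \boldsymbol{c},\ B\boldsymbol{u} = \boldsymbol{m},\ \widehat{B}\boldsymbol{u} = \boldsymbol{w}\}. \]

In the context of linear codes, $\mathcal{C}_A(\boldsymbol{c})$ is called a coset determined by $\boldsymbol{c}$.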

Let $p$ and $p'$ be probability distributions and let $q$ and $q'$ be conditional probability distributions. Then the entropy $H(p)$, the conditional entropy $H(q|p)$, the divergence $D(p\|p')$, and the conditional divergence $D(q\|q'|p)$ are defined as

\[ H(p) \equiv \sum_{u} p(u)\log\frac{1}{p(u)} \]
\[ H(q|p) \equiv \sum_{u,v} q(u|v)p(v)\log\frac{1}{q(u|v)} \]
\[ D(p\|p') \equiv \sum_{u} p(u)\log\frac{p(u)}{p'(u)} \]
\[ D(q\|q'|p) \equiv \sum_{v} p(v)\sum_{u} q(u|v)\log\frac{q(u|v)}{q'(u|v)}, \]

where the logarithm is taken to base 2.

Let $\mu_{UV}$ be the joint probability distribution of random variables $U$ and $V$. Let $\mu_U$ and $\mu_V$ be the respective marginal distributions and $\mu_{U|V}$ be the conditional probability distribution. Then the entropy $H(U)$, the conditional entropy $H(U|V)$, and the mutual information $I(U;V)$ of random variables are defined as

\[ H(U) \equiv H(\mu_U) \]
\[ H(U|V) \equiv H(\mu_{U|V}|\mu_V) \]
\[ I(U;V) \equiv H(U) - H(U|V). \]
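The definitions above translate directly into a few lines of code. The following is a minimal sketch, assuming distributions are stored as Python dictionaries (the function names and this representation are illustrative assumptions, not part of the paper); logarithms are to base 2 as in the text.

```python
import math

def entropy(p):
    """H(p) = sum_u p(u) log2(1/p(u)); p maps symbols to probabilities."""
    return sum(pu * math.log2(1.0 / pu) for pu in p.values() if pu > 0)

def conditional_entropy(q, p):
    """H(q|p) = sum_{u,v} q(u|v) p(v) log2(1/q(u|v)); q maps (u, v) to q(u|v)."""
    return sum(quv * p[v] * math.log2(1.0 / quv)
               for (u, v), quv in q.items() if quv > 0 and p[v] > 0)

def divergence(p, p_prime):
    """D(p||p') = sum_u p(u) log2(p(u)/p'(u)); infinite if p'(u) = 0 < p(u)."""
    total = 0.0
    for u, pu in p.items():
        if pu == 0:
            continue
        if p_prime.get(u, 0) == 0:
            return math.inf
        total += pu * math.log2(pu / p_prime[u])
    return total

def mutual_information(joint):
    """I(U;V) = H(U) - H(U|V) for a joint distribution mapping (u, v) to a probability."""
    p_u, p_v = {}, {}
    for (u, v), puv in joint.items():
        p_u[u] = p_u.get(u, 0.0) + puv
        p_v[v] = p_v.get(v, 0.0) + puv
    # Conditional distribution mu_{U|V}(u|v) = mu_{UV}(u,v) / mu_V(v).
    cond = {(u, v): puv / p_v[v] for (u, v), puv in joint.items() if p_v[v] > 0}
    return entropy(p_u) - conditional_entropy(cond, p_v)
```

For instance, two identical fair bits give one bit of mutual information, and two independent fair bits give zero.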

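Let $\nu_{\boldsymbol{u}}$ and $\nu_{\boldsymbol{u}|\boldsymbol{v}}$ be defined as

\[ \nu_{\boldsymbol{u}}(u) \equiv \frac{|\{1 \leq i \leq n : u_i = u\}|}{n} \]
\[ \nu_{\boldsymbol{u}|\boldsymbol{v}}(u|v) \equiv \frac{\nu_{\boldsymbol{u}\boldsymbol{v}}(u,v)}{\nu_{\boldsymbol{v}}(v)}. \]

We call $\nu_{\boldsymbol{u}}$ a type of $\boldsymbol{u} \in \mathcal{U}^n$ and $\nu_{\boldsymbol{u}|\boldsymbol{v}}$ a conditional type. Let $U \equiv \nu_U$ be the type of a sequence and $U|V \equiv \nu_{U|V}$ be the conditional type of a sequence given a sequence of type $V$. Then a set of typical sequences $\mathcal{T}_U$ and a set of conditionally typical sequences $\mathcal{T}_{U|V}(\boldsymbol{v})$ are defined as

\[ \mathcal{T}_U \equiv \{\boldsymbol{u} : \nu_{\boldsymbol{u}} = \nu_U\} \]
\[ \mathcal{T}_{U|V}(\boldsymbol{v}) \equiv \{\boldsymbol{u} : \nu_{\boldsymbol{u}|\boldsymbol{v}} = \nu_{U|V}\}. \]

The empirical entropy, the empirical conditional entropy, and the empirical mutual information are defined as

\[ H(\boldsymbol{u}) \equiv H(\nu_{\boldsymbol{u}}) \]
\[ H(\boldsymbol{u}|\boldsymbol{v}) \equiv H(\nu_{\boldsymbol{u}|\boldsymbol{v}}|\nu_{\boldsymbol{v}}) \]
\[ I(\boldsymbol{u};\boldsymbol{v}) \equiv H(\boldsymbol{u}) - H(\boldsymbol{u}|\boldsymbol{v}). \]

A set of typical sequences $\mathcal{T}_{U,\gamma}$ and a set of conditionally typical sequences $\mathcal{T}_{U|V,\gamma}(\boldsymbol{v})$ are defined as

\[ \mathcal{T}_{U,\gamma} \equiv \{\boldsymbol{u} : D(\nu_{\boldsymbol{u}}\|\mu_U) < \gamma\} \]
\[ \mathcal{T}_{U|V,\gamma}(\boldsymbol{v}) \equiv \{\boldsymbol{u} : D(\nu_{\boldsymbol{u}|\boldsymbol{v}}\|\mu_{U|V}|\nu_{\boldsymbol{v}}) < \gamma\}. \]

We use several lemmas for the method of types, which are described in the Appendix.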
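As an illustration of types and empirical entropies, the sketch below computes the type of a sequence, the empirical (conditional) entropy, and membership in the typical set defined by a divergence threshold. The function names and the use of tuples for sequences are assumptions made for this example only.

```python
import math
from collections import Counter

def type_of(u, alphabet):
    """Type nu_u: relative frequency of each alphabet symbol in the sequence u."""
    n = len(u)
    counts = Counter(u)
    return {a: counts[a] / n for a in alphabet}

def empirical_entropy(u, alphabet):
    """H(u) = H(nu_u)."""
    return sum(p * math.log2(1 / p) for p in type_of(u, alphabet).values() if p > 0)

def empirical_conditional_entropy(u, v):
    """H(u|v) = H(nu_{u|v} | nu_v) = sum_{a,b} nu_{uv}(a,b) log2(nu_v(b)/nu_{uv}(a,b))."""
    n = len(u)
    joint = Counter(zip(u, v))
    marg_v = Counter(v)
    return sum((c / n) * math.log2(marg_v[b] / c) for (a, b), c in joint.items())

def in_typical_set(u, mu, gamma):
    """True iff D(nu_u || mu) < gamma, i.e. u lies in the typical set T_{U,gamma}."""
    nu = type_of(u, mu.keys())
    d = 0.0
    for a, p in nu.items():
        if p > 0:
            if mu[a] == 0:
                return False  # divergence is infinite
            d += p * math.log2(p / mu[a])
    return d < gamma
```

The empirical mutual information then follows as `empirical_entropy(u, alphabet) - empirical_conditional_entropy(u, v)`.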
In the construction of codes, we use a minimum-divergence encoder

\[ g_{AB\widehat{B}}(\boldsymbol{c},\boldsymbol{m},\boldsymbol{w}) \equiv \arg\min_{\boldsymbol{x}'\in\mathcal{C}_{AB\widehat{B}}(\boldsymbol{c},\boldsymbol{m},\boldsymbol{w})} D(\nu_{\boldsymbol{x}'}\|\mu_X), \tag{1} \]

a maximum-likelihood decoder

\[ g_A(\boldsymbol{c}|\boldsymbol{y}) \equiv \arg\max_{\boldsymbol{x}'\in\mathcal{C}_A(\boldsymbol{c})} \mu_{X|Y}(\boldsymbol{x}'|\boldsymbol{y}), \tag{2} \]

and a minimum-entropy decoder

\[ \widehat{g}_A(\boldsymbol{c}|\boldsymbol{y}) \equiv \arg\min_{\boldsymbol{x}'\in\mathcal{C}_A(\boldsymbol{c})} H(\boldsymbol{x}'|\boldsymbol{y}). \tag{3} \]

The minimum-divergence encoder assigns a message to a typical sequence as close as possible to the input
distribution, where the typical sequence is in the coset determined by $\boldsymbol{c}$. With exhaustive search, the time
complexity of encoding and decoding is exponential in the block length. It should be noted
that the linear programming methods introduced in [9] and [5] can be applied to this encoder and these decoders by
assuming that $\mathcal{X} = \mathcal{Y} = \mathrm{GF}(2)$ and $A$, $B$, and $\widehat{B}$ are linear functions, although the linear programming method
may not find an integral solution. Details are described in Section VIII. It should be noted here that we do not
discuss the performance of the linear programming methods in this paper.
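To make the three maps concrete, here is a minimal exhaustive-search sketch of (1), (2), and (3). It is exponential in the block length, as noted above, and is meant only for toy sizes. The functions `A`, `B`, `Bhat` are arbitrary Python callables on tuples, ties are broken by enumeration order, and `mu_X_given_Y` is assumed to be strictly positive; all names are illustrative assumptions.

```python
import itertools
import math
from collections import Counter

def divergence_from_type(x, mu):
    """D(nu_x || mu) for the type nu_x of the sequence x."""
    n = len(x)
    d = 0.0
    for a, c in Counter(x).items():
        if mu.get(a, 0) == 0:
            return math.inf
        d += (c / n) * math.log2((c / n) / mu[a])
    return d

def cond_empirical_entropy(x, y):
    """Empirical conditional entropy H(x|y)."""
    n = len(x)
    joint, marg = Counter(zip(x, y)), Counter(y)
    return sum((c / n) * math.log2(marg[b] / c) for (_, b), c in joint.items())

def coset(alphabet, n, constraints):
    """All x in alphabet^n with f(x) == value for every (f, value) in constraints."""
    return [x for x in itertools.product(alphabet, repeat=n)
            if all(f(x) == v for f, v in constraints)]

def min_divergence_encoder(alphabet, n, A, B, Bhat, c, m, w, mu_X):
    """Equation (1): sequence in C_{A B Bhat}(c, m, w) whose type is closest to mu_X."""
    cands = coset(alphabet, n, [(A, c), (B, m), (Bhat, w)])
    return min(cands, key=lambda x: divergence_from_type(x, mu_X)) if cands else None

def ml_decoder(alphabet, n, A, c, y, mu_X_given_Y):
    """Equation (2): most likely sequence in C_A(c) given y (memoryless likelihood)."""
    cands = coset(alphabet, n, [(A, c)])
    def loglik(x):
        return sum(math.log2(mu_X_given_Y[(xi, yi)]) for xi, yi in zip(x, y))
    return max(cands, key=loglik) if cands else None

def min_entropy_decoder(alphabet, n, A, c, y):
    """Equation (3): sequence in C_A(c) minimizing the empirical conditional entropy."""
    cands = coset(alphabet, n, [(A, c)])
    return min(cands, key=lambda x: cond_empirical_entropy(x, y)) if cands else None
```

Note that the minimum-entropy decoder does not use the channel statistics at all, which is what makes it suitable for the universal setting.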
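We define $\chi(\cdot)$ as

\[ \chi(a = b) \equiv \begin{cases} 1, & \text{if } a = b \\ 0, & \text{if } a \neq b \end{cases} \]
\[ \chi(a \neq b) \equiv \begin{cases} 1, & \text{if } a \neq b \\ 0, & \text{if } a = b. \end{cases} \]

Finally, we use the following definitions in the Appendix. For $\gamma, \gamma' > 0$, we define

\[ \lambda_{\mathcal{U}} \equiv \frac{|\mathcal{U}|\log[n+1]}{n} \tag{4} \]
\[ \zeta_{\mathcal{U}}(\gamma) \equiv \gamma - \sqrt{2\gamma}\log\frac{\sqrt{2\gamma}}{|\mathcal{U}|} \tag{5} \]
\[ \zeta_{\mathcal{U}|\mathcal{V}}(\gamma'|\gamma) \equiv \gamma' - \sqrt{2\gamma'}\log\frac{\sqrt{2\gamma'}}{|\mathcal{U}||\mathcal{V}|} + \sqrt{2\gamma}\log|\mathcal{U}|. \tag{6} \]

It should be noted here that the product set $\mathcal{U}\times\mathcal{V}$ is denoted by $\mathcal{U}\mathcal{V}$ when it appears in the subscript of these
functions. We define $h(\theta)$ for $0 < \theta < 1$ as

\[ h(\theta) \equiv -\theta\log\theta - [1-\theta]\log(1-\theta). \tag{8} \]

We define $|\cdot|^+$ as

\[ |\theta|^+ \equiv \begin{cases} \theta, & \text{if } \theta > 0, \\ 0, & \text{if } \theta \leq 0. \end{cases} \tag{9} \]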
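III. $(\boldsymbol{\alpha}, \boldsymbol{\beta})$-hash Property

In the following, we review the notion of the hash property for an ensemble of functions, which is introduced
in [18]. This provides a sufficient condition for coding theorems, where the linearity of functions is not assumed.
We prove coding theorems based on this notion.

Definition 1 ([18]): Let $\boldsymbol{\mathcal{A}} \equiv \{\mathcal{A}_n\}_{n=1}^{\infty}$ be a sequence of sets such that $\mathcal{A}_n$ is a set of functions $A : \mathcal{U}^n \to \overline{\mathcal{U}}_{\mathcal{A}_n}$ satisfying

\[ \lim_{n\to\infty} \frac{1}{n}\log\frac{|\overline{\mathcal{U}}_{\mathcal{A}_n}|}{|\mathrm{Im}\mathcal{A}_n|} = 0. \tag{H1} \]

For a probability distribution $p_{A,n}$ on $\mathcal{A}_n$, we call a sequence $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A) \equiv \{(\mathcal{A}_n, p_{A,n})\}_{n=1}^{\infty}$ an ensemble. Then,
$(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ has an $(\boldsymbol{\alpha}_A, \boldsymbol{\beta}_A)$-hash property if there are two sequences $\boldsymbol{\alpha}_A \equiv \{\alpha_A(n)\}_{n=1}^{\infty}$ and $\boldsymbol{\beta}_A \equiv \{\beta_A(n)\}_{n=1}^{\infty}$
such that

\[ \lim_{n\to\infty} \alpha_A(n) = 1 \tag{H2} \]
\[ \lim_{n\to\infty} \beta_A(n) = 0 \tag{H3} \]

and

\[ \sum_{\substack{\boldsymbol{u}\in\mathcal{T} \\ \boldsymbol{u}'\in\mathcal{T}'}} p_{A,n}\left(\{A : A\boldsymbol{u} = A\boldsymbol{u}'\}\right) \leq |\mathcal{T}\cap\mathcal{T}'| + \frac{|\mathcal{T}||\mathcal{T}'|\alpha_A(n)}{|\mathrm{Im}\mathcal{A}_n|} + \min\{|\mathcal{T}|, |\mathcal{T}'|\}\beta_A(n) \tag{H4} \]

for any $\mathcal{T}, \mathcal{T}' \subset \mathcal{U}^n$. Throughout this paper, we omit the dependence of $\mathcal{A}$, $p_A$, $\alpha_A$, and $\beta_A$ on $n$.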

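In the following, we present two examples of ensembles that have a hash property.
Example 1: In this example, we consider a universal class of hash functions introduced in [8]. A set $\mathcal{A}$ of
functions $A : \mathcal{U}^n \to \overline{\mathcal{U}}_{\mathcal{A}}$ is called a universal class of hash functions if

\[ |\{A : A\boldsymbol{u} = A\boldsymbol{u}'\}| \leq \frac{|\mathcal{A}|}{|\overline{\mathcal{U}}_{\mathcal{A}}|} \]

for any $\boldsymbol{u} \neq \boldsymbol{u}'$. For example, the set of all functions on $\mathcal{U}^n$ and the set of all linear functions $A : \mathcal{U}^n \to \mathcal{U}^{l_A}$
are classes of universal hash functions (see [8]). When $\mathcal{A}$ is a universal class of hash functions and $p_A$ is the
uniform distribution on $\mathcal{A}$, we have

\[ \sum_{\substack{\boldsymbol{u}\in\mathcal{T} \\ \boldsymbol{u}'\in\mathcal{T}'}} p_A\left(\{A : A\boldsymbol{u} = A\boldsymbol{u}'\}\right) \leq |\mathcal{T}\cap\mathcal{T}'| + \frac{|\mathcal{T}||\mathcal{T}'|}{|\mathrm{Im}\mathcal{A}|}. \]

This implies that $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ has a $(\boldsymbol{1}, \boldsymbol{0})$-hash property, where $1(n) \equiv 1$ and $0(n) \equiv 0$ for every $n$.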
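The universal-class condition can be checked by brute force for a tiny instance. The sketch below enumerates all $l \times n$ binary matrices (the class of all linear functions over GF(2)) and computes the largest number of matrices that collide on a pair of distinct inputs; the helper names are assumptions for this illustration.

```python
import itertools

def all_binary_matrices(l, n):
    """Enumerate every l x n matrix over GF(2) as a tuple of rows."""
    rows = list(itertools.product((0, 1), repeat=n))
    return list(itertools.product(rows, repeat=l))

def apply(A, u):
    """Matrix-vector product over GF(2)."""
    return tuple(sum(a * b for a, b in zip(row, u)) % 2 for row in A)

def max_collision_count(l, n):
    """Return (max over u != u' of |{A : Au = Au'}|, |A|) for all l x n binary matrices."""
    mats = all_binary_matrices(l, n)
    vecs = list(itertools.product((0, 1), repeat=n))
    worst = 0
    for u, v in itertools.combinations(vecs, 2):
        worst = max(worst, sum(apply(A, u) == apply(A, v) for A in mats))
    return worst, len(mats)
```

For $l = 2$ and $n = 3$ the worst-case collision count times $2^l$ equals the number of matrices, so the bound in the definition holds with equality for this class.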
Example 2: In this example, we consider a set of linear functions $A : \mathcal{U}^n \to \mathcal{U}^{l_A}$. It was discussed in the
above example that the uniform distribution on the set of all linear functions has a $(\boldsymbol{1}, \boldsymbol{0})$-hash property. In the
following, we introduce the ensemble of $q$-ary sparse matrices proposed in [18]. Let $\mathcal{U} \equiv \mathrm{GF}(q)$ and $l_A \equiv nR$
for given $0 < R < 1$. We generate an $l_A \times n$ matrix $A$ with the following procedure, where at most $\tau$ random
nonzero elements are introduced in every column.

1) Start from an all-zero matrix.

2) For each $i \in \{1, \ldots, n\}$, repeat the following procedure $\tau$ times:

a) Choose $(j, a) \in \{1, \ldots, l_A\} \times \mathrm{GF}(q)$ uniformly at random.

b) Add $a$ to the $(j, i)$ component of $A$.






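Let $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ be an ensemble corresponding to the above procedure, where $\tau = O(\log n)$ is even. It is proved
in [18, Theorem 2] that there is $(\boldsymbol{\alpha}_A, \boldsymbol{\beta}_A)$ such that $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ has an $(\boldsymbol{\alpha}_A, \boldsymbol{\beta}_A)$-hash property.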
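The generation procedure of Example 2 can be sketched as follows. This is an illustrative implementation under the assumption that $q$ is prime, so that addition in GF($q$) is integer addition modulo $q$; for prime powers a proper field implementation would be needed. The function name and the optional `rng` argument are assumptions, not part of the paper.

```python
import random

def sparse_matrix(l_A, n, tau, q, rng=None):
    """Generate an l_A x n matrix over GF(q), q prime, following steps 1) and 2)."""
    rng = rng or random.Random()
    A = [[0] * n for _ in range(l_A)]      # 1) start from an all-zero matrix
    for i in range(n):                     # 2) for each column index i
        for _ in range(tau):               #    repeat tau times
            j = rng.randrange(l_A)         #    a) choose (j, a) uniformly at random
            a = rng.randrange(q)
            A[j][i] = (A[j][i] + a) % q    #    b) add a to the (j, i) component
    return A
```

Since each column receives exactly `tau` additions, the number of nonzero entries per column never exceeds `tau`, and it can be smaller when additions cancel or when `a` is zero.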

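In the following, let $\mathcal{A}$ be a set of functions $A : \mathcal{U}^n \to \overline{\mathcal{U}}_{\mathcal{A}}$ and assume that $p_C$ is the uniform distribution
on $\mathrm{Im}\mathcal{A}$, and random variables $A$ and $C$ are mutually independent, that is,

\[ p_C(\boldsymbol{c}) \equiv \begin{cases} \dfrac{1}{|\mathrm{Im}\mathcal{A}|}, & \text{if } \boldsymbol{c} \in \mathrm{Im}\mathcal{A} \\ 0, & \text{if } \boldsymbol{c} \in \overline{\mathcal{U}}_{\mathcal{A}} \setminus \mathrm{Im}\mathcal{A} \end{cases} \]
\[ p_{AC}(A, \boldsymbol{c}) \equiv p_A(A)p_C(\boldsymbol{c}) \]

for any $A$ and $\boldsymbol{c}$. We have the following lemmas, where it is not necessary to assume the linearity of functions.
Lemma 1 ([18, Lemma 1]): If $(\mathcal{A}, p_A)$ satisfies (H4), then

\[ p_A\left(\{A : [\mathcal{G}\setminus\{\boldsymbol{u}\}] \cap \mathcal{C}_A(A\boldsymbol{u}) \neq \emptyset\}\right) \leq \frac{|\mathcal{G}|\alpha_A}{|\mathrm{Im}\mathcal{A}|} + \beta_A \]

for all $\mathcal{G} \subset \mathcal{U}^n$ and all $\boldsymbol{u} \in \mathcal{U}^n$.

Lemma 2 ([18, Lemma 2]): If $(\mathcal{A}, p_A)$ satisfies (H4), then

\[ p_{AC}\left(\{(A, \boldsymbol{c}) : \mathcal{T} \cap \mathcal{C}_A(\boldsymbol{c}) = \emptyset\}\right) \leq \alpha_A - 1 + \frac{|\mathrm{Im}\mathcal{A}|[\beta_A + 1]}{|\mathcal{T}|} \]

for all $\mathcal{T} \neq \emptyset$.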

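Finally, we consider the independent joint ensemble $p_{AB}$ of linear matrices. The following lemma asserts that
the hash properties of $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ and $(\boldsymbol{\mathcal{B}}, \boldsymbol{p}_B)$ are sufficient for $(\boldsymbol{\mathcal{A}}\times\boldsymbol{\mathcal{B}}, \boldsymbol{p}_{AB})$ to have a hash property
when they are ensembles of linear matrices.

Lemma 3 ([18, Lemma 7]): For two ensembles $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ and $(\boldsymbol{\mathcal{B}}, \boldsymbol{p}_B)$ of $l_A \times n$ and $l_B \times n$ linear matrices,
respectively, let $p_{AB}$ be the joint distribution defined as

\[ p_{AB}(A, B) \equiv p_A(A)p_B(B). \]

Then $(\boldsymbol{\mathcal{A}}\times\boldsymbol{\mathcal{B}}, \boldsymbol{p}_{AB})$ has an $(\boldsymbol{\alpha}_{AB}, \boldsymbol{\beta}_{AB})$-hash property for the ensemble of functions $A \oplus B : \mathcal{U}^n \to \mathcal{U}^{l_A + l_B}$
defined as

\[ A \oplus B(\boldsymbol{u}) \equiv (A\boldsymbol{u}, B\boldsymbol{u}), \]

where

\[ \alpha_{AB}(n) \equiv \alpha_A(n)\alpha_B(n) \]
\[ \beta_{AB}(n) \equiv \beta_A(n) + \beta_B(n). \]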
IV. Wiretap Channel Coding 

In this section we consider the wiretap channel coding problem illustrated in Fig. 1, where no common
message and perfect secrecy are assumed. A wiretap channel is characterized by the conditional probability
distribution $\mu_{YZ|X}$, where $X$, $Y$, and $Z$ are random variables corresponding to the channel input of a sender,
the channel output of a legitimate receiver, and the channel output of an eavesdropper, respectively. Then the capacity$^1$ of
this channel is derived in [7, Eq. (11)] as

\[ \mathrm{Capacity} \equiv \max_{\substack{\widetilde{X}, X: \\ \widetilde{X} \leftrightarrow X \leftrightarrow YZ}} \left[ I(\widetilde{X}; Y) - I(\widetilde{X}; Z) \right], \tag{10} \]

where the maximum is taken over all probability distributions $\mu_{\widetilde{X}X}$ and the joint distribution $\mu_{\widetilde{X}XYZ}$ is given
by

\[ \mu_{\widetilde{X}XYZ}(\widetilde{x}, x, y, z) \equiv \mu_{YZ|X}(y, z|x)\mu_{\widetilde{X}X}(\widetilde{x}, x). \tag{11} \]

If a channel between $X$ and $Y$ is more capable than a channel between $X$ and $Z$, that is,

\[ I(X; Y) \geq I(X; Z) \]

is satisfied for every input $X$, then the capacity of this channel is simplified as

\[ \mathrm{Capacity} \equiv \max_{X} \left[ I(X; Y) - I(X; Z) \right], \tag{12} \]

where the maximum is taken over all random variables $X$ and the joint distribution of random variables $(X, Y, Z)$
is given by

\[ \mu_{XYZ}(x, y, z) \equiv \mu_{YZ|X}(y, z|x)\mu_X(x). \tag{13} \]

This capacity formula is derived in [23] for a degraded broadcast channel and extended in [7] to the case where a
channel between $X$ and $Y$ is more capable than a channel between $X$ and $Z$.

$^1$ It is stated in [20] that the auxiliary random variable can be eliminated by applying [10, Theorem 7] and [11, Theorem 3]. In fact,
because of the authors' misunderstanding about the result of [11, Theorem 3], the statement of [20] may not be true. They wish to thank
Prof. Shamai (Shitz), Prof. Oohama, and Prof. Koga, for helpful discussions.

Fig. 3. Construction of Wiretap Channel Code

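In the following, we assume that $\mu_X$ and $\mu_{YZ|X}$ are given, where it is not necessary to assume that a channel
is degraded or that a channel between $X$ and $Y$ is more capable than that between $X$ and $Z$. We fix functions

\[ A : \mathcal{X}^n \to \mathcal{X}^{l_A} \]
\[ B : \mathcal{X}^n \to \mathcal{X}^{l_B} \]
\[ \widehat{B} : \mathcal{X}^n \to \mathcal{X}^{l_{\widehat{B}}} \]

and a vector $\boldsymbol{c} \in \mathcal{X}^{l_A}$ available for an encoder, a decoder, and an eavesdropper, where

\[ l_A \equiv \frac{n[H(X|Y) + \varepsilon_A]}{\log|\mathcal{X}|} \]
\[ l_B \equiv \frac{n[I(X;Y) - I(X;Z)]}{\log|\mathcal{X}|} \]
\[ l_{\widehat{B}} \equiv \frac{n[I(X;Z) - \varepsilon_{\widehat{B}}]}{\log|\mathcal{X}|}. \]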
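We construct a stochastic encoder and assume that the encoder uses a random sequence $\boldsymbol{w} \in \mathcal{X}^{l_{\widehat{B}}}$, which is
generated uniformly at random and independently of the channel and the message $\boldsymbol{m} \in \mathcal{X}^{l_B}$. We define the
encoder and the decoder

\[ \varphi : \mathcal{X}^{l_B} \times \mathcal{X}^{l_{\widehat{B}}} \to \mathcal{X}^n \]
\[ \varphi^{-1} : \mathcal{Y}^n \to \mathcal{X}^{l_B} \]

as

\[ \varphi(\boldsymbol{m}, \boldsymbol{w}) \equiv g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) \]
\[ \varphi^{-1}(\boldsymbol{y}) \equiv Bg_A(\boldsymbol{c}|\boldsymbol{y}), \]

where $g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w})$ and $g_A(\boldsymbol{c}|\boldsymbol{y})$ are defined by (1) and (2), respectively. It is noted that $g_{AB\widehat{B}}$ is a
deterministic map.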

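Let $M$ and $W$ be random variables corresponding to $\boldsymbol{m}$ and $\boldsymbol{w}$, respectively, where the probability
distributions $p_M$ and $p_W$ are given by

\[ p_M(\boldsymbol{m}) \equiv \begin{cases} \dfrac{1}{|\mathrm{Im}\mathcal{B}|}, & \text{if } \boldsymbol{m} \in \mathrm{Im}\mathcal{B} \\ 0, & \text{if } \boldsymbol{m} \notin \mathrm{Im}\mathcal{B} \end{cases} \tag{14} \]
\[ p_W(\boldsymbol{w}) \equiv \begin{cases} \dfrac{1}{|\mathrm{Im}\widehat{\mathcal{B}}|}, & \text{if } \boldsymbol{w} \in \mathrm{Im}\widehat{\mathcal{B}} \\ 0, & \text{if } \boldsymbol{w} \notin \mathrm{Im}\widehat{\mathcal{B}} \end{cases} \tag{15} \]

and the joint distribution $p_{MWYZ}$ of the messages and the channel outputs is given by

\[ p_{MWYZ}(\boldsymbol{m}, \boldsymbol{w}, \boldsymbol{y}, \boldsymbol{z}) \equiv \mu_{YZ|X}(\boldsymbol{y}, \boldsymbol{z}|\varphi(\boldsymbol{m}, \boldsymbol{w}))p_M(\boldsymbol{m})p_W(\boldsymbol{w}). \]

The rate of this code is given by

\[ \mathrm{Rate} \equiv \frac{\log|\mathrm{Im}\mathcal{B}|}{n} = I(X;Y) - I(X;Z) - \frac{1}{n}\log\frac{|\mathcal{X}|^{l_B}}{|\mathrm{Im}\mathcal{B}|}, \]

which converges to $I(X;Y) - I(X;Z)$ as $n$ goes to infinity by assuming the condition (H1) for an ensemble
$(\boldsymbol{\mathcal{B}}, \boldsymbol{p}_B)$. The decoding error probability $\mathrm{Error}_{Y|X}(A, B, \widehat{B}, \boldsymbol{c})$ is given by

\[ \mathrm{Error}_{Y|X}(A, B, \widehat{B}, \boldsymbol{c}) \equiv \sum_{\boldsymbol{m}, \boldsymbol{w}, \boldsymbol{y}} \mu_{Y|X}(\boldsymbol{y}|\varphi(\boldsymbol{m}, \boldsymbol{w}))p_M(\boldsymbol{m})p_W(\boldsymbol{w})\chi(\varphi^{-1}(\boldsymbol{y}) \neq \boldsymbol{m}). \tag{16} \]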
The information leakage $\mathrm{Leakage}_{Z|X}(A, B, \widehat{B}, \boldsymbol{c})$ is given by

\[ \mathrm{Leakage}_{Z|X}(A, B, \widehat{B}, \boldsymbol{c}) \equiv \frac{I(M; Z^n)}{n}. \tag{17} \]

It should be noted that the vector $\boldsymbol{c}$ is considered to be part of a deterministic map, which is known by the
eavesdropper.

We have the following theorem. It should be noted that the alphabets $\mathcal{X}$ and $\mathcal{Y}$ are allowed to be non-binary, and
the channel is allowed to be asymmetric and non-degraded.

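Theorem 1: Let $\mu_{YZ|X}$ be the conditional probability distribution of a stationary memoryless channel. For
given $l_A$, $l_B$, and $l_{\widehat{B}}$, assume that ensembles $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$, $(\boldsymbol{\mathcal{A}}\times\boldsymbol{\mathcal{B}}, \boldsymbol{p}_{AB})$, and $(\boldsymbol{\mathcal{A}}\times\boldsymbol{\mathcal{B}}\times\widehat{\boldsymbol{\mathcal{B}}}, \boldsymbol{p}_{AB\widehat{B}})$ have a hash property.
Then for any $\delta > 0$ and all sufficiently large $n$, there are $\varepsilon_{\widehat{B}} > \varepsilon_A > 0$, functions (sparse matrices) $A \in \mathcal{A}$,
$B \in \mathcal{B}$, $\widehat{B} \in \widehat{\mathcal{B}}$, and a vector $\boldsymbol{c} \in \mathrm{Im}\mathcal{A}$ such that

\[ \mathrm{Rate} > I(X;Y) - I(X;Z) - \delta \tag{18} \]
\[ \mathrm{Error}_{Y|X}(A, B, \widehat{B}, \boldsymbol{c}) < \delta \tag{19} \]
\[ \mathrm{Leakage}_{Z|X}(A, B, \widehat{B}, \boldsymbol{c}) < \delta. \tag{20} \]

By assuming that the channel between $X$ and $Y$ is more capable than that between $X$ and $Z$, $\mu_X$ attains the
secrecy capacity defined by (12), and $\delta \to 0$, the rate of the proposed code is close to the secrecy capacity.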

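For a general wiretap channel $\mu_{YZ|X}$, let $F : \widetilde{\mathcal{X}} \to \mathcal{X}$ be a channel (non-deterministic map) corresponding
to a conditional probability distribution $\mu_{X|\widetilde{X}}$ and assume that $\mu_{\widetilde{X}X}$
achieves the maximum on the right-hand side of (10). By using a proposed code for the channel $\mu_{YZ|\widetilde{X}}$ defined
as

\[ \mu_{YZ|\widetilde{X}}(y, z|\widetilde{x}) \equiv \sum_{x} \mu_{YZ|X}(y, z|x)\mu_{X|\widetilde{X}}(x|\widetilde{x}) \]

with the input distribution $\mu_{\widetilde{X}}$, we construct a code for the channel $\mu_{YZ|X}$ as

\[ \varphi(\boldsymbol{m}, \boldsymbol{w}) \equiv \boldsymbol{F}(g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w})) \]
\[ \varphi^{-1}(\boldsymbol{y}) \equiv Bg_A(\boldsymbol{c}|\boldsymbol{y}), \]

where $g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w})$ outputs the channel input $\widetilde{\boldsymbol{x}} \equiv (\widetilde{x}_1, \ldots, \widetilde{x}_n) \in \widetilde{\mathcal{X}}^n$ of the outer channel $\mu_{YZ|\widetilde{X}}$, $\boldsymbol{F}$ is defined
as

\[ \boldsymbol{F}(\widetilde{\boldsymbol{x}}) \equiv (F(\widetilde{x}_1), \ldots, F(\widetilde{x}_n)), \]

and $g_A(\boldsymbol{c}|\boldsymbol{y})$ reproduces $\widetilde{\boldsymbol{x}}$ with small error probability. Then the rate of this code is close to the secrecy capacity
of the channel $\mu_{YZ|X}$ defined by (10).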

V. Universal Wiretap Channel Coding 

In this section we consider fixed-rate universal wiretap channel coding for any stationary memoryless
channel $\mu_{YZ|X}$, where an input distribution $\mu_X$ is given and it is enough to know an upper bound of $H(X|Y)$
and an upper bound of $I(X;Z)$ before constructing the code, as required by conditions (24) and (25). It should be noted here that we have to know
the sizes of $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{Z}$ in advance.
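For given $R_A, R_B, R_{\widehat{B}} > 0$, let $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$, $(\boldsymbol{\mathcal{B}}, \boldsymbol{p}_B)$, and $(\widehat{\boldsymbol{\mathcal{B}}}, \boldsymbol{p}_{\widehat{B}})$ be ensembles of functions

\[ A : \mathcal{X}^n \to \mathcal{X}^{l_A} \]
\[ B : \mathcal{X}^n \to \mathcal{X}^{l_B} \]
\[ \widehat{B} : \mathcal{X}^n \to \mathcal{X}^{l_{\widehat{B}}} \]

satisfying

\[ R_A = \frac{\log|\mathrm{Im}\mathcal{A}|}{n} \]
\[ R_B = \frac{\log|\mathrm{Im}\mathcal{B}|}{n} \]
\[ R_{\widehat{B}} = \frac{\log|\mathrm{Im}\widehat{\mathcal{B}}|}{n}, \]

respectively. It should be noted that $\mathrm{Im}\mathcal{B}$ represents the set of all messages and $R_B$ represents the encoding rate
of a confidential message.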

We fix functions $A$, $B$, $\widehat{B}$, and a vector $\boldsymbol{c} \in \mathcal{X}^{l_A}$ available for an encoder, a decoder, and an eavesdropper.
We construct a stochastic encoder and assume that the encoder uses a random sequence $\boldsymbol{w} \in \mathcal{X}^{l_{\widehat{B}}}$, which is
generated uniformly at random and independently of the channel and the message $\boldsymbol{m} \in \mathcal{X}^{l_B}$. We define the
same encoder and decoder as in the last section, except that the maximum-likelihood decoder $g_A$ is replaced by the minimum-entropy decoder $\widehat{g}_A$ defined by (3).

Let $M$ and $W$ be random variables corresponding to $\boldsymbol{m}$ and $\boldsymbol{w}$, respectively, where the probability
distributions $p_M$ and $p_W$ are given by (14) and (15), respectively. The decoding error probability $\mathrm{Error}_{Y|X}(A, B, \widehat{B}, \boldsymbol{c})$
and the information leakage $\mathrm{Leakage}_{Z|X}(A, B, \widehat{B}, \boldsymbol{c})$ are given by (16) and (17), respectively.

We have the following theorem. It should be noted that the alphabets $\mathcal{X}$ and $\mathcal{Y}$ are allowed to be non-binary, and
the channel is allowed to be asymmetric.

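Theorem 2: For given $R_A$, $R_B$, and $R_{\widehat{B}}$, assume that ensembles $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$, $(\boldsymbol{\mathcal{A}}\times\boldsymbol{\mathcal{B}}, \boldsymbol{p}_{AB})$, and $(\boldsymbol{\mathcal{A}}\times\boldsymbol{\mathcal{B}}\times\widehat{\boldsymbol{\mathcal{B}}}, \boldsymbol{p}_{AB\widehat{B}})$
have a hash property. Let $\mu_X$ be the distribution of the channel input satisfying

\[ R_A + R_B + R_{\widehat{B}} < H(X), \tag{21} \]

where $R_B$ represents the encoding rate of a confidential message. Then for any $\delta > 0$ and all sufficiently large
$n$, there are functions (sparse matrices) $A \in \mathcal{A}$, $B \in \mathcal{B}$, $\widehat{B} \in \widehat{\mathcal{B}}$, and a vector $\boldsymbol{c} \in \mathrm{Im}\mathcal{A}$ such that

\[ \mathrm{Error}_{Y|X}(A, B, \widehat{B}, \boldsymbol{c}) < \delta \tag{22} \]
\[ \mathrm{Leakage}_{Z|X}(A, B, \widehat{B}, \boldsymbol{c}) < \delta \tag{23} \]

for any stationary memoryless channel $\mu_{YZ|X}$ satisfying

\[ R_A > H(X|Y) \tag{24} \]
\[ R_{\widehat{B}} > I(X;Z). \tag{25} \]

Remark 1: It should be noted that (21), (24), and (25) imply

\[ 0 < R_A < H(X|Z) \]
\[ 0 < R_B < I(X;Y) - I(X;Z). \]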
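The implication stated in Remark 1 can be spelled out in two short steps. The derivation below is a sketch that uses only (21), (24), (25), the positivity of the rates, and the identities $H(X) - I(X;Z) = H(X|Z)$ and $H(X|Z) - H(X|Y) = I(X;Y) - I(X;Z)$; the symbol $R_{\widehat{B}}$ denotes the rate of the third function, whose decoration is garbled in some renderings of this text.

```latex
\begin{align*}
R_A + R_B &< H(X) - R_{\widehat{B}} && \text{by (21)} \\
          &< H(X) - I(X;Z) = H(X|Z) && \text{by (25)}, \\
R_A &< R_A + R_B < H(X|Z) && \text{since } R_B > 0, \\
R_B &< H(X|Z) - R_A < H(X|Z) - H(X|Y) = I(X;Y) - I(X;Z) && \text{by (24)}.
\end{align*}
```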
VI. Secret Key Agreement from Correlated Source Outputs

In this section we construct codes for secret key agreement from the correlated source outputs $(X, Y, Z)$
introduced in [12] (see Fig. 2), where a sender, a receiver, and an eavesdropper have access to $X$, $Y$, and $Z$,
respectively. The secret key capacity, which represents the optimal key generation rate, is given in [17] as

\[ \mathrm{Capacity} \equiv \sup_{n, t, (C_1^t, \widetilde{X}, \widetilde{Y})} \frac{1}{n}\left[ I(\widetilde{X}; \widetilde{Y}) - I(\widetilde{X}; Z^n, C_1^t) \right], \tag{26} \]

where the supremum is taken over all $n$, $t$, and protocols $(C_1^t, \widetilde{X}, \widetilde{Y})$ satisfying the Markov conditions

\[ Y^n Z^n C_{i+1}^t \widetilde{X}\widetilde{Y} \leftrightarrow X^n C_1^{i-1} \leftrightarrow C_i, \quad \text{if } i \text{ is odd} \]
\[ X^n Z^n C_{i+1}^t \widetilde{X}\widetilde{Y} \leftrightarrow Y^n C_1^{i-1} \leftrightarrow C_i, \quad \text{if } i \text{ is even} \]
\[ Y^n Z^n \widetilde{Y} \leftrightarrow X^n C_1^t \leftrightarrow \widetilde{X} \]
\[ X^n Z^n \widetilde{X} \leftrightarrow Y^n C_1^t \leftrightarrow \widetilde{Y}, \]

in which $C_1^t$ represents the communication between the sender and the receiver via a public channel and finally
the sender and the receiver generate $\widetilde{X}$ and $\widetilde{Y}$, respectively. It should be noted that $\widetilde{X} \neq \widetilde{Y}$ is allowed with
high probability. According to [3][4], there are three steps in a secret key agreement: advantage distillation,
information reconciliation, and privacy amplification. This section deals with the combination of information
reconciliation and privacy amplification studied in [1][4][15][16]. In the following, we assume that a fixed joint
distribution $\mu_{XYZ}$ satisfies

\[ I(X;Y) - I(X;Z) = H(X|Z) - H(X|Y) > 0 \]

and do not deal with advantage distillation. From (26), we can construct a protocol whose rate is close to the
secret key capacity by combining an advantage distillation protocol $(C_1^t, \widetilde{X}, \widetilde{Y})$ with the following one-way
secret key agreement protocol, where $I(\widetilde{X}; \widetilde{Y}) - I(\widetilde{X}; Z^n, C_1^t)$, normalized by $n$, is close to the secret key capacity.

In the following, we focus on the one-way secret key agreement protocol. When secret key agreement is
allowed to be one-way from the sender to the receiver, the forward secret key capacity is given in [1] by

\[ \mathrm{Capacity} \equiv \max_{C, \widetilde{X}} \left[ I(\widetilde{X}; Y|C) - I(\widetilde{X}; Z|C) \right], \tag{27} \]

where the maximum is taken over all random variables $C$ and $\widetilde{X}$ that satisfy the Markov condition

\[ C \leftrightarrow \widetilde{X} \leftrightarrow X \leftrightarrow YZ. \]

Since

\[ I(\widetilde{X}; Y|C) - I(\widetilde{X}; Z|C) = I(\widetilde{X}; Y, C) - I(\widetilde{X}; Z, C), \]

we can construct an optimal one-way secret key agreement protocol by applying the following protocol
to the correlated source $(\widetilde{X}, (Y, C), (Z, C))$, which achieves the maximum on the right-hand side of (27).
The following construction is based on [16]. We fix functions

\[ A : \mathcal{X}^n \to \mathcal{X}^{l_A} \]
\[ B : \mathcal{X}^n \to \mathcal{X}^{l_B} \]

available for an encoder, a decoder, and an eavesdropper, where

\[ l_A \equiv \frac{n[H(X|Y) + \varepsilon_A]}{\log|\mathcal{X}|} \]
\[ l_B \equiv \frac{n[H(X|Z) - H(X|Y)]}{\log|\mathcal{X}|} = \frac{n[I(X;Y) - I(X;Z)]}{\log|\mathcal{X}|}. \]

Then a secret key agreement protocol is described below (see Fig. 4).

Fig. 4. Construction of One-way Secret Key Agreement Protocol

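Encoding: Let $\boldsymbol{x} \in \mathcal{X}^n$ be a sender's random sequence. The sender transmits $\boldsymbol{c}$ to a legitimate receiver
via a public channel and generates a secret key $\boldsymbol{m}$, where $\boldsymbol{c}$ and $\boldsymbol{m}$ are defined as

\[ \boldsymbol{c} \equiv A\boldsymbol{x} \tag{28} \]
\[ \boldsymbol{m} \equiv B\boldsymbol{x}, \tag{29} \]

respectively.

Decoding: Let $\boldsymbol{y} \in \mathcal{Y}^n$ be a receiver's random sequence, and $\boldsymbol{c} \equiv A\boldsymbol{x}$ be a codeword received from the
sender via a public channel. The receiver generates a secret key by $Bg_A(\boldsymbol{c}|\boldsymbol{y})$, where $g_A$ is defined by (2).

Let $C$ and $M$ be random variables corresponding to $\boldsymbol{c}$ and $\boldsymbol{m}$ defined by (28) and (29), respectively. The
key generation rate is given by

\[ \mathrm{Rate} \equiv \frac{H(M)}{n}. \tag{30} \]

The error probability of the secret key agreement is given by

\[ \mathrm{Error}_{XY}(A, B) \equiv \mu_{XY}\left(\{(\boldsymbol{x}, \boldsymbol{y}) : Bg_A(A\boldsymbol{x}|\boldsymbol{y}) \neq B\boldsymbol{x}\}\right). \tag{31} \]

The information leakage is given by

\[ \mathrm{Leakage}_{XYZ}(A, B) \equiv \frac{I(M; Z^n, C)}{n}. \tag{32} \]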
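The encoding and decoding steps of the one-way protocol can be sketched for binary sequences as follows. This is a toy illustration, assuming GF(2) arithmetic, an exhaustive-search maximum-likelihood decoder, and a strictly positive conditional distribution supplied as a dictionary `mu_X_given_Y[(x, y)]`; the function names are hypothetical.

```python
import itertools
import math

def matvec(M, x):
    """Matrix-vector product over GF(2)."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in M)

def ml_decode(A, c, y, mu_X_given_Y):
    """Exhaustive maximum-likelihood search over the coset {x : Ax = c}, as in (2)."""
    best, best_ll = None, -math.inf
    for x in itertools.product((0, 1), repeat=len(y)):
        if matvec(A, x) != c:
            continue
        ll = sum(math.log2(mu_X_given_Y[(xi, yi)]) for xi, yi in zip(x, y))
        if ll > best_ll:
            best, best_ll = x, ll
    return best

def sender(A, B, x):
    """Encoding: public message c = Ax as in (28) and secret key m = Bx as in (29)."""
    return matvec(A, x), matvec(B, x)

def receiver(A, B, c, y, mu_X_given_Y):
    """Decoding: the receiver's key is B applied to the decoded sequence."""
    return matvec(B, ml_decode(A, c, y, mu_X_given_Y))
```

In the usage below, `A` is taken to be the parity-check matrix of the [7,4] Hamming code, so a single disagreement between the sender's and the receiver's sequences is corrected and both parties obtain the same key.

```python
A = ((1, 0, 1, 0, 1, 0, 1), (0, 1, 1, 0, 0, 1, 1), (0, 0, 0, 1, 1, 1, 1))
B = ((1, 1, 0, 0, 0, 0, 0), (0, 0, 0, 1, 0, 1, 0))
mu = {(a, b): (0.9 if a == b else 0.1) for a in (0, 1) for b in (0, 1)}
x = (1, 0, 1, 1, 0, 0, 1)
y = (1, 0, 0, 1, 0, 0, 1)   # differs from x in one position
c, m = sender(A, B, x)
key_at_receiver = receiver(A, B, c, y, mu)
```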
We have the following theorem.

Theorem 3: For given $l_A$ and $l_B$, assume that ensembles $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ and $(\boldsymbol{\mathcal{A}}\times\boldsymbol{\mathcal{B}}, \boldsymbol{p}_{AB})$ have a hash property.
For all $\delta > 0$ and sufficiently large $n$, there are $\varepsilon_A > 0$ and functions (sparse matrices) $A \in \mathcal{A}$ and $B \in \mathcal{B}$
such that the above secret key agreement protocol satisfies

\[ \mathrm{Rate} > I(X;Y) - I(X;Z) - \delta \tag{33} \]
\[ \mathrm{Error}_{XY}(A, B) < \delta \tag{34} \]
\[ \mathrm{Leakage}_{XYZ}(A, B) < \delta. \tag{35} \]

By assuming that random variables $C$ and $\widetilde{X}$ attain the forward secret key capacity given by (27) and the
sender sends message $C$ via a public channel before the protocol, the rate of the proposed secret key agreement
protocol for correlated sources $(\widetilde{X}, (Y, C), (Z, C))$ is close to the forward secret key capacity.

VII. Universal Secret Key Agreement from Correlated Source Outputs 

In this section, we construct a fixed-rate universal secret key agreement scheme for any stationary memoryless
sources $(X, Y, Z)$, where it is enough to know an upper bound of $H(X|Y)$ and a lower bound of $H(X|Z)$
before constructing the code. It should be noted here that we have to know the sizes of $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{Z}$ in advance.

For given $R_A, R_B > 0$, let $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ and $(\boldsymbol{\mathcal{B}}, \boldsymbol{p}_B)$ be ensembles of functions

\[ A : \mathcal{X}^n \to \mathcal{X}^{l_A} \]
\[ B : \mathcal{X}^n \to \mathcal{X}^{l_B}, \]

where

\[ l_A \equiv \frac{nR_A}{\log|\mathcal{X}|} \]
\[ l_B \equiv \frac{nR_B}{\log|\mathcal{X}|}. \]

We use the same secret key agreement protocol as that described in the last section except that we replace
$g_A$ by $\widehat{g}_A$ defined by (3).

The key generation rate $\mathrm{Rate}$, the error probability $\mathrm{Error}_{XY}(A, B)$, and the information leakage $\mathrm{Leakage}_{XYZ}(A, B)$
are defined by (30), (31), and (32), respectively.
We have the following theorem.

Theorem 4: For given $R_A$ and $R_B$, assume that ensembles $(\boldsymbol{\mathcal{A}}, \boldsymbol{p}_A)$ and $(\boldsymbol{\mathcal{A}}\times\boldsymbol{\mathcal{B}}, \boldsymbol{p}_{AB})$ have a hash property.
For all $\delta > 0$ and sufficiently large $n$, there are functions (sparse matrices) $A \in \mathcal{A}$ and $B \in \mathcal{B}$ such that the
above secret key agreement scheme satisfies

\[ \mathrm{Rate} > R_B - \delta \tag{36} \]
\[ \mathrm{Error}_{XY}(A, B) < \delta \tag{37} \]
\[ \mathrm{Leakage}_{XYZ}(A, B) < \delta \tag{38} \]

for any stationary memoryless source $(X, Y, Z)$ satisfying

\[ R_A > H(X|Y) \tag{39} \]
\[ R_A + R_B < H(X|Z). \tag{40} \]

Remark 2: It should be noted that (39) and (40) imply

\[ 0 < R_A < H(X|Z) \]
\[ 0 < R_B < I(X;Y) - I(X;Z). \]
VIII. Applying Linear Programming Method to Minimum-divergence Encoder, 
Maximum-likelihood Decoder, and Minimum-entropy Decoder 

In this section, we apply the linear programming method introduced by [9][5] by assuming that $\mathcal{X} = \mathcal{Y} =
\mathrm{GF}(2)$ and $A$, $B$, and $\widehat{B}$ are linear functions (sparse matrices). It should be noted again that this method may
not find integral solutions and we do not discuss the performance of the linear programming methods in this
paper.

First, we construct the minimum-divergence encoder $g_{AB\widehat{B}}$ defined by (1). The following construction is
presented in [19]. We use the fact that the analysis of error probability in the proof of theorems is not changed
if we replace the minimum-divergence encoder $g_{AB\widehat{B}}$ by

\[ g'_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) \equiv \begin{cases} \boldsymbol{x}', & \text{if } \boldsymbol{x}' \in \mathcal{C}_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) \cap \mathcal{T}_U \text{ exists}, \\ \text{`error'}, & \text{otherwise}, \end{cases} \]

where $U$ is defined by (76), which appears in Appendix C. Let

\[ t \equiv \arg\min_{t' \in \{0, 1, \ldots, n\}} D(\nu_{t'}\|\mu_X), \tag{41} \]

where $(\nu_t(0), \nu_t(1)) \equiv (1 - t/n, t/n)$. Then the function $g'_{AB\widehat{B}}$ is realized by finding $\boldsymbol{x}'$ that satisfies $A\boldsymbol{x}' = \boldsymbol{c}$,
$B\boldsymbol{x}' = \boldsymbol{m}$, $\widehat{B}\boldsymbol{x}' = \boldsymbol{w}$, and $\sum_{i=1}^n x'_i = t$, and declaring an encoding error if there is no such $\boldsymbol{x}'$,
where we consider $\boldsymbol{x}'$ as a real-valued vector in the last condition. It should be noted that it is realized by the linear programming method because the conditions
$A\boldsymbol{x}' = \boldsymbol{c}$, $B\boldsymbol{x}' = \boldsymbol{m}$, $\widehat{B}\boldsymbol{x}' = \boldsymbol{w}$ can be represented by linear inequalities by using the technique of [9].
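The target weight $t$ in (41) is easy to compute directly, since it only involves a one-dimensional search over $\{0, 1, \ldots, n\}$. The sketch below assumes a binary input distribution described by the single parameter `p1`, the probability of the symbol 1; the function names are illustrative, and the linear programming step itself is not shown.

```python
import math

def binary_divergence(t, n, p1):
    """D(nu_t || mu_X) with nu_t = (1 - t/n, t/n) and mu_X = (1 - p1, p1)."""
    d = 0.0
    for nu, mu in ((1 - t / n, 1 - p1), (t / n, p1)):
        if nu > 0:
            if mu == 0:
                return math.inf
            d += nu * math.log2(nu / mu)
    return d

def best_weight(n, p1):
    """The Hamming weight t in {0, ..., n} minimizing D(nu_t || mu_X), as in (41)."""
    return min(range(n + 1), key=lambda t: binary_divergence(t, n, p1))
```

For example, with $n = 10$ and probability 0.3 for the symbol 1, the minimizing weight is 3, so the encoder searches for a sequence of Hamming weight 3 in the coset.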

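Next, we construct the maximum-likelihood decoder $g_A$ defined by (2). The following construction is
equivalent to [9]. The function $g_A$ is realized by

\[ g_A(\boldsymbol{c}|\boldsymbol{y}) \equiv \begin{cases}
\arg\min_{\boldsymbol{x}': A\boldsymbol{x}' = \boldsymbol{c}} \sum_{i=1}^n x'_i, & \text{if } 0 < \mu_{X|Y}(1|0), \mu_{X|Y}(1|1) < 1/2 \\
\arg\min_{\boldsymbol{x}': A\boldsymbol{x}' = \boldsymbol{c}} \sum_{i=1}^n [-1]^{y_i} x'_i, & \text{if } 0 < \mu_{X|Y}(1|0), \mu_{X|Y}(0|1) < 1/2 \\
\arg\max_{\boldsymbol{x}': A\boldsymbol{x}' = \boldsymbol{c}} \sum_{i=1}^n [-1]^{y_i} x'_i, & \text{if } 0 < \mu_{X|Y}(0|0), \mu_{X|Y}(1|1) < 1/2 \\
\arg\max_{\boldsymbol{x}': A\boldsymbol{x}' = \boldsymbol{c}} \sum_{i=1}^n x'_i, & \text{if } 0 < \mu_{X|Y}(0|0), \mu_{X|Y}(0|1) < 1/2,
\end{cases} \]

where $\boldsymbol{x}'$ and $\boldsymbol{y}$ are considered as real-valued vectors in $\sum_{i=1}^n x'_i$ and $\sum_{i=1}^n [-1]^{y_i} x'_i$. The above minimizations
and maximizations are linear programming problems because the condition $A\boldsymbol{x}' = \boldsymbol{c}$ can be represented by
linear inequalities by using the technique of [9].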
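Finally, we construct the minimum-entropy decoder $\widehat{g}_A$ defined by (3). The following construction is presented
in [19], which is based on the idea presented in [5]. The function $\widehat{g}_A$ can be realized as

\[ \boldsymbol{x}_{t,\min} \equiv \arg\min_{\substack{\boldsymbol{x}': A\boldsymbol{x}' = \boldsymbol{c} \\ \sum_{i=1}^n x'_i = t}} \sum_{i=1}^n y_i x'_i \tag{42} \]
\[ \boldsymbol{x}_{t,\max} \equiv \arg\max_{\substack{\boldsymbol{x}': A\boldsymbol{x}' = \boldsymbol{c} \\ \sum_{i=1}^n x'_i = t}} \sum_{i=1}^n y_i x'_i \tag{43} \]
\[ \widehat{g}_A(\boldsymbol{c}|\boldsymbol{y}) \equiv \arg\min_{\boldsymbol{x}' \in \bigcup_{t=0}^n \{\boldsymbol{x}_{t,\min}, \boldsymbol{x}_{t,\max}\}} H(\boldsymbol{x}'|\boldsymbol{y}), \tag{44} \]

where $\boldsymbol{x}'$ and $\boldsymbol{y}$ are considered as real-valued vectors in $\sum_{i=1}^n x'_i$ and $\sum_{i=1}^n y_i x'_i$. The derivation of (44) is
presented in [19, Appendix A]. We can use the linear programming method to obtain $\boldsymbol{x}_{t,\min}$ and $\boldsymbol{x}_{t,\max}$ because
the constraint $A\boldsymbol{x}' = \boldsymbol{c}$ can be represented by linear inequalities by using the technique introduced in [9]. It
should be noted that $\widehat{g}_A$ can be replaced by

\[ \widehat{g}'_A(\boldsymbol{c}|\boldsymbol{y}) \equiv \arg\min_{\boldsymbol{x}' \in \mathcal{C}_A(\boldsymbol{c}) \cap \mathcal{T}_U} H(\boldsymbol{x}'|\boldsymbol{y}) = \arg\min_{\boldsymbol{x}' \in \{\boldsymbol{x}_{t,\min}, \boldsymbol{x}_{t,\max}\}} H(\boldsymbol{x}'|\boldsymbol{y}) \tag{45} \]

by assuming that $U$ defined by (76) or $t$ defined by (41) is shared by the encoder and the decoder, where $\boldsymbol{x}_{t,\min}$
and $\boldsymbol{x}_{t,\max}$ are defined by (42) and (43), respectively.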

IX. Proof of Theorems 

A. Proof of Theorem 1 

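We use the following lemma, which is proved in the Appendix.
Lemma 4: Let $g_{AB}(\boldsymbol{c}, \boldsymbol{m}|\boldsymbol{z})$ be defined as

\[ g_{AB}(\boldsymbol{c}, \boldsymbol{m}|\boldsymbol{z}) \equiv \arg\max_{\boldsymbol{x}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m})} \mu_{X|Z}(\boldsymbol{x}'|\boldsymbol{z}). \]

Then, for all $\delta' > 0$, all sufficiently small $\gamma > 0$, and all sufficiently large $n$, there are functions (sparse
matrices) $A \in \mathcal{A}$, $B \in \mathcal{B}$, $\widehat{B} \in \widehat{\mathcal{B}}$, and a vector $\boldsymbol{c} \in \mathrm{Im}\mathcal{A}$ such that

\[ p_{MWYZ}\left(\left\{ (\boldsymbol{m}, \boldsymbol{w}, \boldsymbol{y}, \boldsymbol{z}) : \begin{array}{l} g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) \notin \mathcal{T}_{X,\gamma} \\ \text{or } \boldsymbol{z} \notin \mathcal{T}_{Z|X,\gamma}(g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w})) \\ \text{or } g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) \neq g_{AB}(\boldsymbol{c}, \boldsymbol{m}|\boldsymbol{z}) \\ \text{or } g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) \neq g_A(\boldsymbol{c}|\boldsymbol{y}) \end{array} \right\}\right) \leq \delta'. \tag{46} \]

Now we prove Theorem 1. The inequality (18) has already been shown. Since $g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) = g_A(\boldsymbol{c}|\boldsymbol{y})$
implies

\[ \varphi^{-1}(\boldsymbol{y}) = Bg_A(\boldsymbol{c}|\boldsymbol{y}) = Bg_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) = \boldsymbol{m} \]

for all $\boldsymbol{c}$ and $\boldsymbol{w}$, the inequality (19) comes immediately from Lemma 4 by letting $\delta' < \delta$.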
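In the following we prove (20). From Lemma 4 and Fano's inequality, we have

\[ H(g_{AB\widehat{B}}(\boldsymbol{c}, M, W)|Z^n, M) \leq h(\delta') + n\delta'\log|\mathcal{X}| \tag{47} \]

for all $\delta' > 0$ and all sufficiently large $n$, where $h$ is defined by (8).
Let $\boldsymbol{x} \equiv g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w})$, $X^n \equiv g_{AB\widehat{B}}(\boldsymbol{c}, M, W)$, and let $\widetilde{\mathcal{X}}$ be the set of all encoder outputs, that is,

\[ \widetilde{\mathcal{X}} \equiv \{g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w}) : \boldsymbol{m} \in \mathrm{Im}\mathcal{B},\ \boldsymbol{w} \in \mathrm{Im}\widehat{\mathcal{B}}\}. \]

Then the probability distribution $p_{X^nZ^n}$ is given by

\[ \begin{aligned} p_{X^nZ^n}(\boldsymbol{x}, \boldsymbol{z}) &= \sum_{\substack{\boldsymbol{m}, \boldsymbol{w}: \\ \boldsymbol{x} = g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w})}} \mu_{Z|X}(\boldsymbol{z}|\boldsymbol{x})p_M(\boldsymbol{m})p_W(\boldsymbol{w}) \\ &= \begin{cases} \dfrac{\mu_{Z|X}(\boldsymbol{z}|\boldsymbol{x})}{|\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}|}, & \text{if } \boldsymbol{x} \in \widetilde{\mathcal{X}} \\ 0, & \text{otherwise}, \end{cases} \end{aligned} \tag{48} \]

where the summation equals zero when $\boldsymbol{x} \notin \widetilde{\mathcal{X}}$ and the second equality comes from the fact that if $\boldsymbol{x} \in \widetilde{\mathcal{X}}$ then
there is a unique pair $(\boldsymbol{m}, \boldsymbol{w})$, with $\boldsymbol{m} \in \mathrm{Im}\mathcal{B}$ and $\boldsymbol{w} \in \mathrm{Im}\widehat{\mathcal{B}}$, such that $\boldsymbol{x} = g_{AB\widehat{B}}(\boldsymbol{c}, \boldsymbol{m}, \boldsymbol{w})$. From Lemma 15, we have

\[ \mu_{Z|X}(\boldsymbol{z}|\boldsymbol{x}) \leq 2^{-n[H(Z|X) - \zeta_{\mathcal{Z}|\mathcal{X}}(\gamma|\gamma)]} \tag{49} \]

for $\boldsymbol{x} \in \mathcal{T}_{X,\gamma}$ and $\boldsymbol{z} \in \mathcal{T}_{Z|X,\gamma}(\boldsymbol{x})$. Then the joint entropy $H(X^n, Z^n)$ is bounded as

\[ \begin{aligned} H(X^n, Z^n) &\geq \sum_{\boldsymbol{x} \in \mathcal{T}_{X,\gamma}} \sum_{\boldsymbol{z} \in \mathcal{T}_{Z|X,\gamma}(\boldsymbol{x})} p_{X^nZ^n}(\boldsymbol{x}, \boldsymbol{z})\log\frac{1}{p_{X^nZ^n}(\boldsymbol{x}, \boldsymbol{z})} \\ &\geq \sum_{\boldsymbol{x} \in \mathcal{T}_{X,\gamma}} \sum_{\boldsymbol{z} \in \mathcal{T}_{Z|X,\gamma}(\boldsymbol{x})} p_{X^nZ^n}(\boldsymbol{x}, \boldsymbol{z})\left[ n[H(Z|X) - \zeta_{\mathcal{Z}|\mathcal{X}}(\gamma|\gamma)] + \log|\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}| \right] \\ &\geq n[1 - \delta']\left[ H(Z|X) + \frac{1}{n}\log|\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}| - \zeta_{\mathcal{Z}|\mathcal{X}}(\gamma|\gamma) \right] \\ &\geq n[H(Z|X) + I(X;Y)] - \log\frac{|\mathcal{X}|^{l_B + l_{\widehat{B}}}}{|\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}|} - n\left[ \delta'\log|\mathcal{X}||\mathcal{Z}| + \zeta_{\mathcal{Z}|\mathcal{X}}(\gamma|\gamma) + \varepsilon_{\widehat{B}} \right] \end{aligned} \tag{50} \]

for sufficiently large $n$, where the second inequality comes from (48) and (49), and the third inequality comes
from Lemma 4. Then we have

\[ \begin{aligned} I(M; Z^n) &= H(M) + H(Z^n) - H(Z^n, M) \\ &= H(M) + H(Z^n) - H(Z^n, M, g_{AB\widehat{B}}(\boldsymbol{c}, M, W)) + H(g_{AB\widehat{B}}(\boldsymbol{c}, M, W)|Z^n, M) \\ &= H(M) + H(Z^n) - H(Z^n, g_{AB\widehat{B}}(\boldsymbol{c}, M, W)) + H(g_{AB\widehat{B}}(\boldsymbol{c}, M, W)|Z^n, M) \\ &\leq H(M) + H(Z^n) - H(Z^n, g_{AB\widehat{B}}(\boldsymbol{c}, M, W)) + h(\delta') + n\delta'\log|\mathcal{X}| \\ &\leq n[I(X;Y) - I(X;Z)] + H(Z^n) - n[H(Z|X) + I(X;Y)] + \log\frac{|\mathcal{X}|^{l_B + l_{\widehat{B}}}}{|\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}|} \\ &\quad + n\left[ \delta'\log|\mathcal{X}||\mathcal{Z}| + \zeta_{\mathcal{Z}|\mathcal{X}}(\gamma|\gamma) + \varepsilon_{\widehat{B}} \right] + h(\delta') + n\delta'\log|\mathcal{X}| \\ &\leq n\delta \end{aligned} \tag{51} \]

for sufficiently large $n$, where the third equality comes from the fact that $Bg_{AB\widehat{B}}(\boldsymbol{c}, M, W) = M$, the first inequality
comes from (47), the second inequality comes from (50), and we choose suitable $\varepsilon_{\widehat{B}}, \gamma, \delta' > 0$ to satisfy the
last inequality. From this inequality we have (20).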
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B. Proof of Theorem 2 

We use the following lemmas, which are proved in Appendix. 

Lemma 5: If I(X;Z) < R, then for all e > there is a random variable Z taking values in Z = X x Z 
and a function / such that 

I(X;Z) = R + e 
Z = f(Z). 

Lemma 6: Let g A B(c,m\z) be defined as 

g AB (c,m\z) = arg min H(x'\z). 

x'£CAB(c,m) 

Then, for all 5' > 0, all sufficiently small 7 > 0, and sufficiently large n, there are functions (sparse matrices) 
A e A, B e B, B e B, and a vector c e Im.4 such that 



(m,w,y,z) : 



or 5 ^ Tg| Xi7 (5 Asg (c, m,w)) 
or 9 A BB^ rn ^ w ) ^9A(c\y) 
or 9ABB( c ' m > w ) ¥=9AB(c,m\z) ^ 



\ 



<5' 



(52) 



for any A*y^|x satisfying 



R A + R B + R S < H(X) (53) 
^ > H{X\Y) (54) 
R A + R B > H(X\Z). (55) 

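Now we prove Theorem 2. The inequality (22) is shown similarly to the proof of (19).
In the following we prove (23). From Lemma 5, there is $\widetilde{Z} \in \widetilde{\mathcal{Z}}$ such that

\[ I(X;\widetilde{Z}) = R_{\widehat{B}} + \varepsilon, \tag{56} \]

where $\varepsilon > 0$ is specified later. From Lemma 6 and Fano's inequality, we have

\[ H(g_{AB\widehat{B}}(c, M, W) \mid \widetilde{Z}^n, M) \le h(\delta') + n\delta' \log|\mathcal{X}| \tag{57} \]

for all $\delta' > 0$ and sufficiently large $n$, where $h$ is defined by (8). Similarly to the proof of (50), we have

\begin{align*}
H(\widetilde{Z}^n, g_{AB\widehat{B}}(c,M,W))
&\ge n[1-\delta']\left[ H(\widetilde{Z}|X) + \frac{1}{n}\log|\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}| - \zeta_{\widetilde{\mathcal{Z}}|\mathcal{X}}(\gamma|\gamma) \right] \\
&\ge nH(\widetilde{Z}|X) + \log|\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}| - n\left[ \delta'\log|\mathcal{X}||\widetilde{\mathcal{Z}}| + \zeta_{\widetilde{\mathcal{Z}}|\mathcal{X}}(\gamma|\gamma) \right] \\
&\ge n\left[ H(\widetilde{Z}|X) + R_B + R_{\widehat{B}} \right] - n\left[ \delta'\log|\mathcal{X}||\widetilde{\mathcal{Z}}| + \zeta_{\widetilde{\mathcal{Z}}|\mathcal{X}}(\gamma|\gamma) \right], \tag{58}
\end{align*}

where the second inequality comes from the fact that $R_B + R_{\widehat{B}} < H(X) \le \log|\mathcal{X}|$. Then we have

\begin{align*}
I(M;Z^n) &= I(M; f(\widetilde{Z}_1), \ldots, f(\widetilde{Z}_n)) \\
&\le I(M;\widetilde{Z}^n) \\
&= H(M) + H(\widetilde{Z}^n) - H(\widetilde{Z}^n, M, g_{AB\widehat{B}}(c,M,W)) + H(g_{AB\widehat{B}}(c,M,W) \mid \widetilde{Z}^n, M) \\
&\le H(M) + H(\widetilde{Z}^n) - H(\widetilde{Z}^n, g_{AB\widehat{B}}(c,M,W)) + h(\delta') + n\delta'\log|\mathcal{X}| \\
&\le nR_B + H(\widetilde{Z}^n) - n\left[ H(\widetilde{Z}|X) + R_B + R_{\widehat{B}} \right] + n\left[ \delta'\log|\mathcal{X}||\widetilde{\mathcal{Z}}| + \zeta_{\widetilde{\mathcal{Z}}|\mathcal{X}}(\gamma|\gamma) \right] + h(\delta') + n\delta'\log|\mathcal{X}| \\
&\le n\left[ I(X;\widetilde{Z}) - R_{\widehat{B}} \right] + n\left[ \delta'\log|\mathcal{X}||\widetilde{\mathcal{Z}}| + \zeta_{\widetilde{\mathcal{Z}}|\mathcal{X}}(\gamma|\gamma) \right] + h(\delta') + n\delta'\log|\mathcal{X}| \\
&\le n\left[ \varepsilon + \delta'\log|\mathcal{X}|^2|\widetilde{\mathcal{Z}}| + \zeta_{\widetilde{\mathcal{Z}}|\mathcal{X}}(\gamma|\gamma) + \frac{h(\delta')}{n} \right] \\
&\le n\delta, \tag{59}
\end{align*}

where the second inequality comes from (57) and $M = B g_{AB\widehat{B}}(c,M,W)$, the third inequality comes from
(58), the fifth inequality comes from (56), and we choose a suitable $\gamma > 0$, a suitable $\varepsilon > 0$, and a suitable
$\delta' > 0$ to satisfy the last inequality. From this inequality, we have (23).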
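The role of the Fano term in (57) and (59) can be checked numerically. The following is a minimal sketch, assuming that $h$ in (8) is the binary entropy function and that logarithms are taken to base 2; the function names are hypothetical and not part of the paper. It shows that the bound divided by $n$ approaches $\delta' \log|\mathcal{X}|$, which is why a small $\delta'$ suffices in the last step of (59).

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy in bits; assumed here to be the function h of (8)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def fano_bound(n: int, delta: float, alphabet_size: int) -> float:
    """Right-hand side of (57): h(delta') + n * delta' * log|X|, in bits."""
    return binary_entropy(delta) + n * delta * math.log2(alphabet_size)


def per_symbol_fano_bound(n: int, delta: float, alphabet_size: int) -> float:
    """The same bound normalized by the block length n."""
    return fano_bound(n, delta, alphabet_size) / n
```

For a fixed $\delta'$ the term $h(\delta')/n$ vanishes as $n$ grows, so only $\delta' \log|\mathcal{X}|$ remains in the per-symbol bound.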

C. Proof of Theorem 3 

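We use the following lemma, which is proved in the Appendix.
Lemma 7: Let $g_{AB}(c, m|z)$ be defined as

\[ g_{AB}(c,m|z) \equiv \arg\max_{x' \in \mathcal{C}_{AB}(c,m)} \mu_{X|Z}(x'|z). \]

Then, for any $\delta' > 0$ and all sufficiently large $n$, there are functions (sparse matrices) $A \in \mathcal{A}$ and $B \in \mathcal{B}$ such
that

\[
\mu_{XYZ}\left(\left\{ (x,y,z) :
\begin{array}{l}
g_A(Ax|y) \neq x \\
\text{or } g_{AB}(Ax,Bx|z) \neq x
\end{array}
\right\}\right) \le \delta'. \tag{60}
\]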

Now we prove Theorem 3.

First, we prove (34). Since $g_A(Ax|y) = x$ implies $B g_A(Ax|y) = Bx$, the inequality (34) comes
immediately from Lemma 7 by letting $\delta' < \delta$.

Next, we prove (35). From Lemma 7 and Fano's inequality, we have

\[ H(X^n \mid Z^n, C, M) \le h(\delta') + n\delta'\log|\mathcal{X}| \]

for all $\delta' > 0$ and all sufficiently large $n$, where $h$ is defined by (8). This implies that

\begin{align*}
H(Z^n, C, M) &\ge H(X^n, Z^n, C, M) - h(\delta') - n\delta'\log|\mathcal{X}| \\
&= H(X^n, Z^n) - h(\delta') - n\delta'\log|\mathcal{X}| \tag{61}
\end{align*}

for all $\delta' > 0$ and all sufficiently large $n$, where the equality comes from the definitions (28) and (29) of $C$ and
$M$. Then we have

\begin{align*}
I(M;Z^n,C) &= H(Z^n,C) + H(M) - H(Z^n,C,M) \\
&\le H(Z^n) + H(C) + H(M) - H(Z^n,C,M) \\
&\le H(Z^n) + H(C) + H(M) - H(X^n,Z^n) + h(\delta') + n\delta'\log|\mathcal{X}| \\
&\le H(Z^n) + n[H(X|Y) + \varepsilon_A] + n[H(X|Z) - H(X|Y)] - H(X^n,Z^n) + h(\delta') + n\delta'\log|\mathcal{X}| \\
&= n\varepsilon_A + h(\delta') + n\delta'\log|\mathcal{X}| \\
&\le n\delta, \tag{62}
\end{align*}

where the second inequality comes from (61), the third inequality comes from the definitions (28) and (29)
of $C$ and $M$, and we choose a suitable $\varepsilon_A > 0$ and a suitable $\delta' > 0$ to satisfy the last inequality. From this
inequality we have (35).

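Finally, we prove (33). We have

\begin{align*}
H(M) &= H(M) + H(Z^n,C) - H(Z^n,C) \\
&\ge H(Z^n,C,M) - H(Z^n) - H(C) \\
&\ge H(X^n,Z^n) - h(\delta') - n\delta'\log|\mathcal{X}| - H(Z^n) - n[H(X|Y) + \varepsilon_A] \\
&= n[I(X;Y) - I(X;Z)] - n\varepsilon_A - h(\delta') - n\delta'\log|\mathcal{X}| \\
&\ge n[I(X;Y) - I(X;Z)] - n\delta, \tag{63}
\end{align*}

where the second inequality comes from (61), the equality uses $H(X^n,Z^n) - H(Z^n) - nH(X|Y) = n[H(X|Z) - H(X|Y)] = n[I(X;Y) - I(X;Z)]$, and we choose a suitable $\varepsilon_A > 0$ and a suitable $\delta' > 0$ to satisfy
the last inequality. From this inequality we have (33).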
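The identity used in the last steps of (62) and (63), $H(X|Z) - H(X|Y) = I(X;Y) - I(X;Z)$, can be verified on a small example. The sketch below is illustrative only: the source (a uniform bit $X$ observed through two independent binary symmetric channels) and all function names are assumptions made for this example, not the sources considered in the paper.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0.0)


def marginal(joint, keep):
    """Marginalize a joint distribution over (x, y, z) onto the coordinates in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def bsc_joint(p_y, p_z):
    """Hypothetical source: X is a uniform bit, Y and Z are X sent through
    independent binary symmetric channels with crossover p_y and p_z."""
    joint = {}
    for x in (0, 1):
        for y in (0, 1):
            for z in (0, 1):
                py = p_y if y != x else 1.0 - p_y
                pz = p_z if z != x else 1.0 - p_z
                joint[(x, y, z)] = 0.5 * py * pz
    return joint


def key_rate_two_ways(joint):
    """Return (H(X|Z) - H(X|Y), I(X;Y) - I(X;Z)); the two values should agree."""
    h_x = entropy(marginal(joint, (0,)))
    h_y = entropy(marginal(joint, (1,)))
    h_z = entropy(marginal(joint, (2,)))
    h_xy = entropy(marginal(joint, (0, 1)))
    h_xz = entropy(marginal(joint, (0, 2)))
    via_conditional = (h_xz - h_z) - (h_xy - h_y)
    via_mutual_info = (h_x + h_y - h_xy) - (h_x + h_z - h_xz)
    return via_conditional, via_mutual_info
```

Calling `key_rate_two_ways(bsc_joint(0.1, 0.3))` returns two numerically equal values; for this particular source both equal the difference of the binary entropies of the two crossover probabilities.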

D. Proof of Theorem 4 

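We use the following lemmas, which are proved in the Appendix.

Lemma 8: If $H(X|Z) > R$, then for all $\varepsilon > 0$ there is a random variable $\widetilde{Z}$ taking values in $\widetilde{\mathcal{Z}} = \mathcal{X} \times \mathcal{Z}$
and a function $f$ such that

\begin{align*}
H(X|\widetilde{Z}) &= R - \varepsilon \\
Z &= f(\widetilde{Z}).
\end{align*}

■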

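Lemma 9: Let $g_{AB}(c, m|z)$ be defined as

\[ g_{AB}(c,m|z) \equiv \arg\min_{x' \in \mathcal{C}_{AB}(c,m)} H(x'|z). \]

Then, for any $\delta' > 0$ and all sufficiently large $n$, there are functions (sparse matrices) $A \in \mathcal{A}$ and $B \in \mathcal{B}$ such
that

\[
\mu_{XYZ}\left(\left\{ (x,y,z) :
\begin{array}{l}
g_A(Ax|y) \neq x \\
\text{or } g_{AB}(Ax,Bx|z) \neq x
\end{array}
\right\}\right) \le \delta' \tag{64}
\]

for any $\mu_{XYZ}$ satisfying

\begin{align*}
R_A &> H(X|Y) \tag{65} \\
R_A + R_B &> H(X|Z). \tag{66}
\end{align*}

■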
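The minimum-entropy decoder in Lemma 9 picks, among all sequences with the prescribed syndrome, one whose empirical conditional entropy given the side information is smallest. The following brute-force sketch illustrates the definition for a tiny binary example. It is an illustration under stated assumptions only: the matrix, the exhaustive search, and the function names are hypothetical, ties are broken arbitrarily by enumeration order, and the approach is exponential in $n$, unlike the approximation methods the paper refers to.

```python
import math
from collections import Counter
from itertools import product


def empirical_cond_entropy(x, y):
    """Empirical conditional entropy H(x|y) in bits, computed from the joint type of (x, y)."""
    n = len(x)
    joint = Counter(zip(x, y))
    marg_y = Counter(y)
    return -sum(cnt / n * math.log2(cnt / marg_y[b]) for (_, b), cnt in joint.items())


def syndrome(A, x):
    """Compute Ax over GF(2) for a matrix A given as a list of rows."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in A)


def min_entropy_decode(A, c, y):
    """Brute-force arg min of H(x'|y) over the coset {x' : Ax' = c}."""
    n = len(y)
    candidates = [x for x in product((0, 1), repeat=n) if syndrome(A, x) == tuple(c)]
    return min(candidates, key=lambda x: empirical_cond_entropy(x, y))
```

By construction the decoder output always satisfies the syndrome constraint and has empirical conditional entropy no larger than that of the true sequence; whether it equals the true sequence depends on the rate conditions (65) and (66) and on the block length.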

Now we prove Theorem 4. In the following, we prove (38) and (36). The proof of (37) is similar to that of
(64).

First, from Lemma 8 and (66), there is $\widetilde{Z} \in \widetilde{\mathcal{Z}}$ such that

\[ H(X|\widetilde{Z}) = R_A + R_B - \varepsilon, \tag{67} \]

where $\varepsilon > 0$ is specified later. From Lemma 9 and Fano's inequality, we have

\[ H(X^n \mid \widetilde{Z}^n, C, M) \le h(\delta') + n\delta'\log|\mathcal{X}| \]

for all $\delta' > 0$ and all sufficiently large $n$, where $h$ is defined by (8). This implies that

\[ H(\widetilde{Z}^n, C, M) \ge H(X^n, \widetilde{Z}^n) - h(\delta') - n\delta'\log|\mathcal{X}|. \tag{68} \]

Next, we prove (38). We have

\begin{align*}
I(M; Z^n, C) &= I(M; f(\widetilde{Z}_1), \ldots, f(\widetilde{Z}_n), C) \\
&\le I(M; \widetilde{Z}^n, C) \\
&\le H(\widetilde{Z}^n) + H(C) + H(M) - H(\widetilde{Z}^n, C, M) \\
&\le H(\widetilde{Z}^n) + H(C) + H(M) - H(X^n, \widetilde{Z}^n) + h(\delta') + n\delta'\log|\mathcal{X}| \\
&\le H(\widetilde{Z}^n) + nR_A + nR_B - H(X^n, \widetilde{Z}^n) + h(\delta') + n\delta'\log|\mathcal{X}| \\
&= n\varepsilon + h(\delta') + n\delta'\log|\mathcal{X}| \\
&\le n\delta, \tag{69}
\end{align*}

where the third inequality comes from (68), the fourth inequality comes from the definitions (28) and (29) of
$C$ and $M$, the second equality comes from (67), and we choose a suitable $\varepsilon > 0$ and a suitable $\delta' > 0$ to satisfy
the last inequality. From this inequality, we have (38).

Finally, we prove (36). We have

\begin{align*}
H(M) &\ge H(\widetilde{Z}^n, C, M) - H(\widetilde{Z}^n) - H(C) \\
&\ge H(X^n, \widetilde{Z}^n) - h(\delta') - n\delta'\log|\mathcal{X}| - H(\widetilde{Z}^n) - nR_A \\
&= nR_B - n\varepsilon - h(\delta') - n\delta'\log|\mathcal{X}| \\
&\ge nR_B - n\delta, \tag{70}
\end{align*}

where the second inequality comes from (68), the equality comes from (67), and we choose a suitable $\varepsilon > 0$
and a suitable $\delta' > 0$ to satisfy the last inequality. From this inequality, we have (36).

X. Conclusion 

The constructions of codes for the wiretap channel and secret key agreement from correlated source outputs 
were presented. The optimality, reliability, and security of the codes were proved and the universal reliability 
and security were also proved. The proof of the theorems is based on the notion of a hash property for an 
ensemble of functions. Since an ensemble of sparse matrices has a hash property, we can construct codes by 
using sparse matrices, and practical encoding and decoding methods are expected to be effective. We believe that
our construction can be applied to a quantum channel to realize quantum cryptography. However, it should
be noted that the security criteria would have to be revised to their quantum versions.

Appendix

A. Proof of Lemmas 

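Before the proofs of Lemmas 4 and 6, we prepare the following lemmas.
Lemma 10 ([18, Lemma 8]): For any $A$ and $u \in \mathcal{U}^n$,

\[ p_C(\{c : Au = c\}) = \sum_{c} p_C(c)\chi(Au = c) = \frac{1}{|\mathrm{Im}\mathcal{A}|} \]

and for any $u \in \mathcal{U}^n$

\[ E_{AC}[\chi(Au = c)] = \sum_{A,c} p_{AC}(A,c)\chi(Au = c) = \frac{1}{|\mathrm{Im}\mathcal{A}|}. \]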

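Lemma 11 ([18, Lemma 3]): If $(\mathcal{A}, p_A)$ satisfies (H4), then

\[
p_{AC}\left(\left\{ (A,c) :
\begin{array}{l}
\mathcal{G} \cap \mathcal{C}_A(c) \neq \emptyset \\
u \in \mathcal{C}_A(c)
\end{array}
\right\}\right) \le \frac{|\mathcal{G}|\alpha_A}{|\mathrm{Im}\mathcal{A}|^2} + \frac{\beta_A}{|\mathrm{Im}\mathcal{A}|}
\]

for all $\mathcal{G} \subset \mathcal{U}^n$ and all $u \notin \mathcal{G}$.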

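Lemma 12: Assume that $\varepsilon_{\widehat{B}} > \varepsilon_A > 0$. For $\beta_A$ satisfying $\lim_{n\to\infty}\beta_A(n) = 0$ and any $\gamma > 0$, there is a
sequence $\kappa \equiv \{\kappa(n)\}_{n=1}^{\infty}$ and $\mathcal{T} \subset \mathcal{T}_U \subset \mathcal{T}_{X,\gamma}$ such that

\begin{align*}
\lim_{n\to\infty} \kappa(n) &= \infty \tag{71} \\
\lim_{n\to\infty} \kappa(n)\beta_A(n) &= 0 \tag{72} \\
\lim_{n\to\infty} \frac{\log\kappa(n)}{n} &= 0 \tag{73}
\end{align*}

and

\begin{align*}
\gamma &> D(\nu_U \| \mu_X) \tag{74} \\
\kappa &\le \frac{|\mathcal{T}|}{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}|} \le 2\kappa \tag{75}
\end{align*}

for all sufficiently large $n$, where $U$ is defined as

\[ U \equiv \arg\min_{U'} D(\nu_{U'} \| \mu_X). \tag{76} \]

In the following, $\kappa$ denotes $\kappa(n)$.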
Proof: Let 



«(n) = < 



if ^(n)=o(n-«),^>0 



, —, — , otherwise 



(77) 



for every $n$. It is clear that $\kappa$ satisfies (71) and (72). It is also clear that $\kappa$ satisfies (73) when $\beta_A(n) = o(n^{-\xi})$,
$\xi > 0$. If $\beta_A(n)$ is not $o(n^{-\xi})$, there is $\kappa' > 0$ such that $\beta_A(n)n^{\xi} > \kappa'$ and

log«(n) = lQ gfflW 
n 2n 

2n 

£ log n — log k' 
2n 



< 



(78) 
for all sufficiently large $n$. This implies that $\kappa$ satisfies (73). The inequality (74) comes from Lemma 21. From
Lemma 21 and $\varepsilon_{\widehat{B}} > \varepsilon_A > 0$, we have

R A + R B + R S + < H(U) - X x < H{X) (79) 

for all sufficiently large n. Then we have 

\Tx„\ > \Tu\ 

= K|Im^||ImS||IniB| (80) 

for all sufficiently large $n$, where the first inequality comes from (74). This implies that there is $\mathcal{T} \subset \mathcal{T}_U \subset \mathcal{T}_{X,\gamma}$
such that

\[ \kappa \le \frac{|\mathcal{T}|}{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}|} \le 2\kappa \tag{81} \]

for all sufficiently large $n$. ■
Remark 3: It should be noted that we can let $\xi$ be arbitrarily large in (77) when $\beta_A(n)$ vanishes exponentially
fast. This parameter $\xi$ affects the upper bound of (46) and (52). ■

B. Proof of Lemma 4 

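In the following, we assume that $\varepsilon_A$, $\varepsilon_{\widehat{B}}$, and $\gamma > 0$ satisfy

\[ \varepsilon_{\widehat{B}} > \varepsilon_A > \max\left\{ \zeta_{\mathcal{X}|\mathcal{Y}}(2\gamma|2\gamma), \zeta_{\mathcal{X}|\mathcal{Z}}(2\gamma|2\gamma) \right\}. \tag{82} \]

Let $\kappa \equiv \{\kappa(n)\}_{n=1}^{\infty}$ be a sequence satisfying (71)-(73), and $U$ be defined by (76). Then (74) is satisfied for
all $\gamma > 0$ and all sufficiently large $n$. From Lemma 12, there is $\mathcal{T} \subset \mathcal{T}_U \subset \mathcal{T}_{X,\gamma}$ satisfying (75).

Let $x$ be an input of the channel, and $y$ and $z$ be the channel outputs of the receiver and the eavesdropper,
respectively. Let $m$ be a message and $w$ be a random sequence. We define

• $g_{AB\widehat{B}}(c,m,w) \in \mathcal{T} \subset \mathcal{T}_{X,\gamma}$ (W1)

• $y \in \mathcal{T}_{Y|X,\gamma}(g_{AB\widehat{B}}(c,m,w))$ (W2)

• $z \in \mathcal{T}_{Z|X,\gamma}(g_{AB\widehat{B}}(c,m,w))$ (W3)

• $g_A(c|y) = g_{AB\widehat{B}}(c,m,w)$ (W4)

• $g_{AB}(c,m|z) = g_{AB\widehat{B}}(c,m,w)$. (W5)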
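Then the left hand side of (46) is upper bounded by

\begin{align*}
& p_{MWYZ}\left(\left\{ (m,w,y,z) :
\begin{array}{l}
g_{AB\widehat{B}}(c,m,w) \notin \mathcal{T}_{X,\gamma} \\
\text{or } z \notin \mathcal{T}_{Z|X,\gamma}(g_{AB\widehat{B}}(c,m,w)) \\
\text{or } g_{AB\widehat{B}}(c,m,w) \neq g_A(c|y) \\
\text{or } g_{AB\widehat{B}}(c,m,w) \neq g_{AB}(c,m|z)
\end{array}
\right\}\right) \\
&\quad \le p_{MWYZ}(S_1^c) + p_{MWYZ}(S_1 \cap S_2^c) + p_{MWYZ}(S_1 \cap S_3^c) + p_{MWYZ}(S_1 \cap S_2 \cap S_4^c) \\
&\qquad + p_{MWYZ}(S_1 \cap S_3 \cap S_5^c), \tag{83}
\end{align*}

where

\[ S_i \equiv \{(m,w,y,z) : (\mathrm{W}i)\}. \]

First, we evaluate $E_{AB\widehat{B}C}[p_{MWYZ}(S_1^c)]$. From Lemma 2 and (81), we have

\begin{align*}
E_{AB\widehat{B}C}[p_{MWYZ}(S_1^c)] &= p_{AB\widehat{B}CMW}\left(\left\{ (A,B,\widehat{B},c,m,w) : g_{AB\widehat{B}}(c,m,w) \notin \mathcal{T} \right\}\right) \\
&\le p_{AB\widehat{B}CMW}\left(\left\{ (A,B,\widehat{B},c,m,w) : \mathcal{T} \cap \mathcal{C}_{AB\widehat{B}}(c,m,w) = \emptyset \right\}\right) \\
&\le \alpha_{AB\widehat{B}} - 1 + \frac{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}||\mathrm{Im}\widehat{\mathcal{B}}|[\beta_{AB\widehat{B}} + 1]}{|\mathcal{T}|} \\
&\le \alpha_{AB\widehat{B}} - 1 + \frac{\beta_{AB\widehat{B}} + 1}{\kappa} \\
&\le \frac{\delta'}{5} \tag{84}
\end{align*}

for all $\delta' > 0$ and sufficiently large $n$, where the last inequality comes from (71) and the properties (H2) and
(H3) of an ensemble $(\mathcal{A} \times \mathcal{B} \times \widehat{\mathcal{B}}, p_{AB\widehat{B}})$.

Next, we evaluate $E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_2^c)]$ and $E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_3^c)]$. From Lemma 16, we have

\begin{align*}
E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_2^c)] &\le \frac{\delta'}{5} \tag{85} \\
E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_3^c)] &\le \frac{\delta'}{5} \tag{86}
\end{align*}

for all $\delta' > 0$ and sufficiently large $n$.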

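Next, we evaluate $E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_2 \cap S_4^c)]$ and $E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_3 \cap S_5^c)]$. In the following,
we assume that

• $x \in \mathcal{T} \subset \mathcal{T}_{X,\gamma}$
• $y \in \mathcal{T}_{Y|X,\gamma}(x)$
• $g_A(c|y) \neq x$.

From Lemma 14, we have $(x,y) \in \mathcal{T}_{XY,2\gamma}$ and $x \in \mathcal{T}_{X|Y,2\gamma}(y)$. Then there is $x' \in \mathcal{C}_A(c)$ such that $x' \neq x$
and

\begin{align*}
\mu_{X|Y}(x'|y) &\ge \mu_{X|Y}(x|y) \\
&\ge 2^{-n[H(X|Y) + \zeta_{\mathcal{X}|\mathcal{Y}}(2\gamma|2\gamma)]},
\end{align*}

where the second inequality comes from Lemma 16. This implies that $[\mathcal{G}(y) \setminus \{x\}] \cap \mathcal{C}_A(c) \neq \emptyset$, where

\[ \mathcal{G}(y) \equiv \left\{ x' : \mu_{X|Y}(x'|y) \ge 2^{-n[H(X|Y) + \zeta_{\mathcal{X}|\mathcal{Y}}(2\gamma|2\gamma)]} \right\}. \]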

Then we have 

Eabbc [pmwyz{S\ n5 2 n SI)] 



< E 



ABBCMW 



E x(9 AB g(C,M,W) =x) J2 VY\x(y\x)x(9A(C\y) ? x) 

S/e7V|x, 7 (aO 



< E 



ABBCMW 



Y J X{Ax = C)x{Bx = M) X (Bx = W) ]T ^ x {y\x) x {g A {C\y) ^ x) 
•eeT yeT Y \x,~,(x) 



= E E VY\x(y\x)E AC [x(9A(C\y) ^ x) x {Ax = C)E bSmw 

xgT y£T Y \x,~,(x) 



X (Bx = M) X (Bx = W) 



, T * g. E E /^|*(wl*)*Uc [x(ffA(C|y) # z) X (As = C)} 
\ImB\\lmB\ xGTy£TYiXri(x) 



x£Ty£TY\x n (x) 



(Ac): 



[g(?/)\W]nc A (c)^ 
x e C A (c) 



" IT KMT n\ ^ ^2 »Y\X(V\X) 

|ImB||ImB| 9eTy€T (a) 



2 n[H(X\Y)+^y(2 1 \2 1 )] aA p 



< 



< 



2 n[H(X\Y)+C xl y(2j\2y)] aA 

Ml 



\lmA\ 2 



+ 



\lmA\ 



E — - 

^M||IrnB||Irn£| 



Ml 



Pa 
+ 2k@ a 



5' 
< — 
~ 5 



(87) 



for all $\delta' > 0$ and sufficiently large $n$, where the second inequality comes from Lemma 10, the fourth inequality
comes from Lemma 11 and the fact that

\[ |\mathcal{G}(y)| \le 2^{n[H(X|Y) + \zeta_{\mathcal{X}|\mathcal{Y}}(2\gamma|2\gamma)]}, \]

the sixth inequality comes from (81), and the last inequality comes from (72), (82) and the properties (H1)-(H3)
of an ensemble $(\mathcal{A}, p_A)$. Similarly, we have

2K\X\ lA+lB 2~ n t £A ~( x \ z ( 2 ~>\ 2 ~>^a A B 
e abbc \Pmwyz(Si n S 3 n 5g)] < , T AIIT ___„, 1- 2k/3ab 



|Im^||ImS| 



8' 
< — 
~ 5 



(88) 



for all 8' > and sufficiently large n. 

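Finally, from (83)-(88), we have the fact that for all $\delta' > 0$ and sufficiently large $n$ there are $A \in \mathcal{A}$, $B \in \mathcal{B}$,
$\widehat{B} \in \widehat{\mathcal{B}}$, and $c \in \mathrm{Im}\mathcal{A}$ such that

\[
p_{MWYZ}\left(\left\{ (m,w,y,z) :
\begin{array}{l}
g_{AB\widehat{B}}(c,m,w) \notin \mathcal{T}_{X,\gamma} \\
\text{or } z \notin \mathcal{T}_{Z|X,\gamma}(g_{AB\widehat{B}}(c,m,w)) \\
\text{or } g_{AB\widehat{B}}(c,m,w) \neq g_A(c|y) \\
\text{or } g_{AB\widehat{B}}(c,m,w) \neq g_{AB}(c,m|z)
\end{array}
\right\}\right) \le \delta'.
\]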



C. Proof of Lemma 5 

The following proof is based on [16, Lemma 1]. If there is a random variable $X'$ taking values in $\mathcal{X}$ such
that

\[ H(X|X',Z) = R' \tag{89} \]

for given $(X, Z)$ and $0 \le R' \le H(X|Z)$, the lemma is proved by letting

\begin{align*}
R' &\equiv H(X) - R - \varepsilon < H(X|Z) \\
\widetilde{Z} &\equiv (X',Z) \\
f(\widetilde{z}) &\equiv z \quad \text{for } \widetilde{z} = (x', z)
\end{align*}

because

\begin{align*}
I(X;\widetilde{Z}) &= H(X) - H(X|\widetilde{Z}) \\
&= H(X) - R' \\
&= R + \varepsilon. \tag{90}
\end{align*}

The following proves the existence of $X'$ satisfying (89). It is clear that $0 \le H(X|X',Z) \le H(X|Z)$ for any
$(X, X', Z)$, $H(X|X',Z) = H(X|Z)$ when $X'$ is independent of $(X, Z)$, and $H(X|X',Z) = 0$ when $X' = X$.
Since $H(X|X',Z)$ is a continuous function of the conditional distribution $p_{X'|XZ}$, we have the existence of
$p_{X'|XZ}$ satisfying $H(X|X',Z) = R'$ from the intermediate value theorem, where $p_{XX'Z}$ is given by

\[ p_{XX'Z}(x,x',z) \equiv \mu_{XZ}(x,z)\,p_{X'|XZ}(x'|x,z) \]

for $(x, x', z) \in \mathcal{X} \times \mathcal{X} \times \mathcal{Z}$. ■
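The intermediate value argument can be made concrete with a one-parameter family of conditional distributions. The sketch below is only an illustration under assumptions that are not in the paper: $X'$ equals $X$ with probability $t$ and is otherwise an independent uniform draw, so $t = 0$ gives $H(X|X',Z) = H(X|Z)$ and $t = 1$ gives $H(X|X',Z) = 0$; bisection then locates a $t$ with $H(X|X',Z) = R'$. The function names and the example distribution are hypothetical.

```python
import math
from itertools import product


def cond_entropy_given_xprime_z(p_xz, t):
    """H(X | X', Z) in bits when X' = X with probability t and X' is an
    independent uniform symbol with probability 1 - t (an assumed family)."""
    xs = sorted({x for x, _ in p_xz})
    zs = sorted({z for _, z in p_xz})
    k = len(xs)
    # Joint p(x, x', z) = mu_XZ(x, z) * p(x' | x); here p(x' | x) ignores z.
    joint = {}
    for x, xp, z in product(xs, xs, zs):
        p_xp = t * (1.0 if xp == x else 0.0) + (1.0 - t) / k
        joint[(x, xp, z)] = p_xz.get((x, z), 0.0) * p_xp
    h = 0.0
    for xp, z in product(xs, zs):
        marg = sum(joint[(x, xp, z)] for x in xs)
        for x in xs:
            p = joint[(x, xp, z)]
            if p > 0.0:
                h -= p * math.log2(p / marg)
    return h


def find_mixing_weight(p_xz, target, tol=1e-10):
    """Bisection on t in [0, 1]; relies only on continuity and the endpoint values."""
    lo, hi = 0.0, 1.0  # value at lo is H(X|Z) >= target, value at hi is 0 <= target
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cond_entropy_given_xprime_z(p_xz, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

The bisection keeps the invariant that the entropy exceeds the target at `lo` and does not exceed it at `hi`, so it converges to a crossing point without assuming monotonicity in $t$.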



D. Proof of Lemma 6 

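Let $\kappa \equiv \{\kappa(n)\}_{n=1}^{\infty}$ be a sequence satisfying (71)-(73). Let $U$ be defined by (76). Then (74) is satisfied for
all $\gamma > 0$ and all sufficiently large $n$. From Lemma 12, there is $\mathcal{T} \subset \mathcal{T}_U \subset \mathcal{T}_{X,\gamma}$ satisfying (75).
We define

• $g_{AB\widehat{B}}(c,m,w) \in \mathcal{T} \subset \mathcal{T}_U \subset \mathcal{T}_{X,\gamma}$ (UW1)
• $z \in \mathcal{T}_{Z|X,\gamma}(g_{AB\widehat{B}}(c,m,w))$ (UW2)
• $g_A(c|y) = g_{AB\widehat{B}}(c,m,w)$ (UW3)
• $g_{AB}(c,m|z) = g_{AB\widehat{B}}(c,m,w)$, (UW4)

where we assume that $n$ is large enough to satisfy $\mathcal{T}_{Z|X,\gamma}(x) \neq \emptyset$ for all $x \in \mathcal{T}_{X,\gamma}$. Then the left hand side of
(52) is upper bounded by

\begin{align*}
& p_{MWYZ}\left(\left\{ (m,w,y,z) :
\begin{array}{l}
g_{AB\widehat{B}}(c,m,w) \notin \mathcal{T}_{X,\gamma} \\
\text{or } z \notin \mathcal{T}_{Z|X,\gamma}(g_{AB\widehat{B}}(c,m,w)) \\
\text{or } g_{AB\widehat{B}}(c,m,w) \neq g_A(c|y) \\
\text{or } g_{AB\widehat{B}}(c,m,w) \neq g_{AB}(c,m|z)
\end{array}
\right\}\right) \\
&\quad \le p_{MWYZ}(S_1^c) + p_{MWYZ}(S_1 \cap S_2^c) + p_{MWYZ}(S_1 \cap S_3^c) + p_{MWYZ}(S_1 \cap S_4^c), \tag{91}
\end{align*}

where

\[ S_i \equiv \{(m,w,y,z) : (\mathrm{UW}i)\}. \]

First, we evaluate $E_{AB\widehat{B}C}[p_{MWYZ}(S_1^c)]$. Similarly to the proof of (84), we have

\[ E_{AB\widehat{B}C}[p_{MWYZ}(S_1^c)] \le \alpha_{AB\widehat{B}} - 1 + \frac{\beta_{AB\widehat{B}} + 1}{\kappa}. \tag{92} \]

Next, we evaluate $E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_2^c)]$. From Lemma 16, we have

\[ E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_2^c)] \le 2^{-n[\gamma - \lambda_{\mathcal{X}\mathcal{Z}}]}. \tag{93} \]

Next, we evaluate $E_{AB\widehat{B}C}[p_{MWYZ}(S_1 \cap S_3^c)]$. Let

\[ \mathcal{G}(y) \equiv \{x' : H(x'|y) \le H(U|V)\} \]

and assume that $(x,y) \in \mathcal{T}_{UV}$. Then we have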



E AC [x(Ax = C)x(g A (C\y) ^ x)] = vac 



PA < 



Ax = c 
(A,c) : 3x' ^ x s.t. 
\ [ H(x'\y) < H(x\y) and Ax' = c 

3x' ^ x s.t. 1 ^ 

H(x'\y) < H(x\y) and Ax' = Ax} J 
3x' ^x s.t. F(x'|y) <i2"(t/|\0 



Ml 



PA < 



p c ({c : Ax = c}) 
\ 



A : 



and Ax' = Ax 



-fcbj maX l £ PA ({A : Ax = Ax'}) , 1 > 
1 1 {x'eS(y)\{x} J 

^ - max { \hnA\ +PaA ) 

mzx{aA,l}2- n ^- H(u ^ + - x ^ + 13 A ] 



\lmA\ 
1 

|InU| 
1 

Ml 



(94) 
where $|\cdot|^+$ is defined by (9), the third equality comes from Lemma 10, and the second inequality comes from
Lemma 20 and the property (H4) of $(\mathcal{A}, p_A)$. Let

\[ F_{Y|X}(R) \equiv \min_{V|U}\left[ D(\nu_{V|U} \| \mu_{Y|X} | \nu_U) + |R - H(U|V)|^+ \right], \]

where $V|U$ denotes the conditional type given type $U$. Then we have

E ABBC [PMWYz( S l S D] 



< E 



ABBCMW 



= E 



ABBCMW 



< E 



ABBCMW 



^2^2^Y\x(y\x)x(9AB(c,m,w) =x) x {gA{c\y) ^ x) 
.xeT v 



EE E HY\x(y\x)x(9A B (C, M) = x)x(g A (C\y) ? x) 
xeTv\Uyer vlu (x) 



EE E »Y l x(y\x) X (Ax = C)x(Bx = M)x(Bx = W)x(9A(C\y)^x) 

xeTV\UyeT viu (x) 



x (Bx = M) x (Bx = W) 



< 



EE E ^Y\x{y\x)E AC [x{Ax = C)x{9A{C\y)^x)]E BSMW 

xeTV\UyeT v \u(v) 

, T .mt^mt g, EE E /*r,x(y|*)[nu«{ax I l}2-[l^-^l + -^+^ 



— -E 



< 



— -E 



max {a A , 1} ]T £ ^ |JC ( V |*)2-"[I^-'W)I + -A«>] + ^ 
V\U yeT v \u{x) 



max {a A , 1} ]T 2- n ^l""' il 'l*l ,v ) + l^- H ^l v )l + - A ^] + A 



< 



\T\ 



|Im^||ImB||ImB| 



m0,x{a A ,l}^ nlFYlx(RA) ~ 2Xxy] + Pa 



< 2k 



m&x{a A ,l}2- n ^ Y ^( R ^- 2X ^ +/3 A 



(95) 



where the third inequality comes from Lemma 10 and (94), the fourth inequality comes from Lemmas 13
and 19, the fifth inequality comes from the definition of $F_{Y|X}$ and Lemma 18, and the last inequality comes
from (81). Similarly, we have



E ABC [PMWYzfr S i)] ^ 2K [ max l«AB, 1} 2- n[F z\x (RA+RB) - 2X *z ] + fi A B 



(96) 



where 



F z\x( R ) = mjn [D(u vllu \\^ lx \uu) + \R- H(U\V 
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From (91)-(93), (95) and (96), we have 



E 



ABBC 



Pmwyz 



(m,w,y,z) : 



+ 1 



g AB (c,m,w) Tx n 

0r *£ r z\X,^9ABB^ m ^ W )) 



^ a ABB - 1 + ABB + 2- n ^- x *^ + 2k max {c^, 1} 2""[ inf *V|x(«A)-2A* y ] + ^ 



2k 



max{a A B,l}2" Il[infFg ^ ( ^ +iiBK2A ^ 1 + £ 



AS 



where the infimum is taken over all satisfying (53)-(55). This implies that there are A e A, B e B, 

B £ B, and c e Im.4 such that 



Pmwyz 



(m,w,y,z) : 



g A B(c,m,w) £ Tx,~t 
OTZ^Tz lXtl (g AB g(c,m,w)) 
0V 9 ABB ( c ^ m ^ w ) ¥=9A(c\y) 
or 9 A BB^ m ^ w ) ^ 9AB(c,m\z)^ 



\ 



(97) 



^ "ass - 1 + + 2-™^- A ^l + 2k max {cm, 1} 2-"I inf ■*V|x(iU)-2A* y ] + ^ 



+ 2k 



max{a AB , 1} 2"" [inf *S|*(«a+**)-2\,.*] + 



'AS 



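Since

\begin{align*}
\inf_{\mu_{Y|X} :\, H(X|Y) < R_A} F_{Y|X}(R_A) &> 0 \\
\inf_{\mu_{Z|X} :\, H(X|Z) < R_A + R_B} F_{Z|X}(R_A + R_B) &> 0,
\end{align*}

the right hand side of (97) goes to zero as $n \to \infty$ by assuming (71)-(73) and the properties (H2) and
(H3) of ensembles $(\mathcal{A}, p_A)$, $(\mathcal{A} \times \mathcal{B}, p_{AB})$, and $(\mathcal{A} \times \mathcal{B} \times \widehat{\mathcal{B}}, p_{AB\widehat{B}})$. ■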

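E. Proof of Lemma 7

In the following, we assume that $\varepsilon_A > 0$ and $\gamma > 0$ satisfy

\[ \varepsilon_A > \max\left\{ \zeta_{\mathcal{X}|\mathcal{Y}}(\gamma|\gamma), \zeta_{\mathcal{X}|\mathcal{Z}}(\gamma|\gamma) \right\}. \tag{98} \]

Let $x$, $y$, and $z$ be outputs of correlated sources. We define

• $(x,y,z) \in \mathcal{T}_{XYZ,\gamma}$ (SKA1)

• $g_A(Ax|y) = x$ (SKA2)

• $g_{AB}(Ax,Bx|z) = x$. (SKA3)

Then the left hand side of (60) is upper bounded by

\begin{align*}
& \mu_{XYZ}\left(\left\{ (x,y,z) :
\begin{array}{l}
g_A(Ax|y) \neq x \\
\text{or } g_{AB}(Ax,Bx|z) \neq x
\end{array}
\right\}\right) \\
&\quad \le \mu_{XYZ}(S_1^c) + \mu_{XYZ}(S_1 \cap S_2^c) + \mu_{XYZ}(S_1 \cap S_3^c), \tag{99}
\end{align*}

where

\[ S_i \equiv \{(x,y,z) : (\mathrm{SKA}i)\}. \]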
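First, we evaluate $E_{AB}[\mu_{XYZ}(S_1^c)]$. From (81), we have

\begin{align*}
E_{AB}[\mu_{XYZ}(S_1^c)] &\le 2^{-n[\gamma - \lambda_{\mathcal{X}\mathcal{Y}\mathcal{Z}}]} \\
&\le \frac{\delta'}{3} \tag{100}
\end{align*}

for all $\delta' > 0$ and sufficiently large $n$.

Next, we evaluate $E_{AB}[\mu_{XYZ}(S_1 \cap S_2^c)]$ and $E_{AB}[\mu_{XYZ}(S_1 \cap S_3^c)]$. For $(x,y,z) \in \mathcal{T}_{XYZ,\gamma}$, Lemma 14 gives $(x, y) \in
\mathcal{T}_{XY,\gamma}$ and $x \in \mathcal{T}_{X|Y,\gamma}(y)$. If $g_A(Ax|y) \neq x$, then there is $x' \in \mathcal{C}_A(Ax)$ such that $x' \neq x$ and

\begin{align*}
\mu_{X|Y}(x'|y) &\ge \mu_{X|Y}(x|y) \\
&\ge 2^{-n[H(X|Y) + \zeta_{\mathcal{X}|\mathcal{Y}}(\gamma|\gamma)]}, \tag{101}
\end{align*}

where the second inequality comes from Lemma 16. This implies that $[\mathcal{G}(y) \setminus \{x\}] \cap \mathcal{C}_A(Ax) \neq \emptyset$, where

\[ \mathcal{G}(y) \equiv \left\{ x' : \mu_{X|Y}(x'|y) \ge 2^{-n[H(X|Y) + \zeta_{\mathcal{X}|\mathcal{Y}}(\gamma|\gamma)]} \right\}. \]

Then we have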

Eab Wyz{Si n S 2 C )] < ^ y, z)^ ({A : [Q(y) \ {x}} n Ca(Ab) ^ 0}) 

(cc,i/,z)e7xyz,^ 

'|£(y)|o4 



|W| + Ai 

2 n[H(X|y)+C A -| J ,(7l7)] Q , A 

|Im4| 



< VxYz(x,y,z) 

< VxYz(x,y,z) 

~ \hnA\ 

\X\ l *a A 2 _ nlsA _ c> , ivHh)] | 
|Iia4| 

< y (102) 

for all $\delta' > 0$ and sufficiently large $n$ by taking an appropriate $\gamma > 0$, where the second inequality comes from
Lemma 1 and the third inequality comes from the fact that

\[ |\mathcal{G}(y)| \le 2^{n[H(X|Y) + \zeta_{\mathcal{X}|\mathcal{Y}}(\gamma|\gamma)]}, \]

the fifth inequality comes from the definition of $l_{\mathcal{A}}$, and the last inequality comes from (98) and the properties
(H1)-(H3) of an ensemble $(\mathcal{A}, p_A)$. Similarly, we have

Eab [pxyz^ n S|)] < ^^ 2^^*™1 + Pab 
5' 

< j (103) 

for all 5' > and sufficiently large n. 
Finally, from (99)-(103), we have the fact that for all $\delta' > 0$ and sufficiently large $n$ there are $A \in \mathcal{A}$ and
$B \in \mathcal{B}$ such that

\[
\mu_{XYZ}\left(\left\{ (x,y,z) :
\begin{array}{l}
g_A(Ax|y) \neq x \\
\text{or } g_{AB}(Ax,Bx|z) \neq x
\end{array}
\right\}\right) \le \delta'.
\]

■
Remark 4: It should be noted that the property (H2) of ensembles (A., p A ) and (Ax B, p AB ) can be replaced 

by 

lim = i 

n^oo n 

lim l0g ^ (n) = 1, 

n^oo n 

respectively. In particular, there are expurgated ensembles (A.,p A ) and (B,p B ) of sparse matrices that have 
an (a J 4,0)-hash property, where the condition (H2) for a A and a AB is replaced by the above respective 
conditions (see [2]). 

F. Proof of Lemma 8 

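It has already been shown in the proof of Lemma 5 that there is a random variable $X'$ taking values in $\mathcal{X}$
such that

\[ H(X|X',Z) = R' \]

for given $(X, Z)$ and $0 \le R' \le H(X|Z)$. The lemma is proved by letting

\begin{align*}
R' &\equiv R - \varepsilon \\
\widetilde{Z} &\equiv (X',Z) \\
f(\widetilde{z}) &\equiv z \quad \text{for } \widetilde{z} = (x', z).
\end{align*}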



G. Proof of Lemma 9 

Let $x$, $y$, and $z$ be outputs of the correlated sources. We define

• $g_A(Ax|y) = x$ (USKA1)

• $g_{AB}(Ax,Bx|z) = x$. (USKA2)

Then the left hand side of (64) is upper bounded by

\[
\mu_{XYZ}\left(\left\{ (x,y,z) :
\begin{array}{l}
g_A(Ax|y) \neq x \\
\text{or } g_{AB}(Ax,Bx|z) \neq x
\end{array}
\right\}\right) \le \mu_{XYZ}(S_1^c) + \mu_{XYZ}(S_2^c), \tag{104}
\]

where

\[ S_i \equiv \{(x,y,z) : (\mathrm{USKA}i)\}. \]



April 12, 2010 




In the following, we evaluate $E_{AB}[\mu_{XYZ}(S_1^c)]$ and $E_{AB}[\mu_{XYZ}(S_2^c)]$. Let $UV$ be the type of sequence
$(x,y) \in \mathcal{X}^n \times \mathcal{Y}^n$ and $V|U$ be the conditional type given type $U$. In the following, we assume that $(x, y) \in \mathcal{T}_{UV}$.
If $g_A(Ax|y) \neq x$, then there is $x' \in \mathcal{C}_A(Ax)$ such that $x' \neq x$ and

\[ H(x'|y) \le H(x|y) = H(U|V). \]

This implies that $[\mathcal{G}(y) \setminus \{x\}] \cap \mathcal{C}_A(Ax) \neq \emptyset$, where

\[ \mathcal{G}(y) \equiv \{x' : H(x'|y) \le H(U|V)\}. \]

Then we have

E AB [ X {9A{Ax\y) ± x)\ < PA ({A : [Q{y) \ {x}} n C a (Ab) + 0}) 

f |g(»)l«A , fl J 

- max {^4T + ^' 7 

r^mum+^OA J 
^ maX l M 

< max J l) 2 -»l\**-BV\V)\ + -\xy] + (105) 

where $|\cdot|^+$ is defined by (9), the second inequality comes from Lemma 1, and the third inequality comes from
Lemma 20. Let

\[ F_{XY}(R) \equiv \min_{UV}\left[ D(\nu_{UV} \| \mu_{XY}) + |R - H(U|V)|^+ \right]. \]



Then we have 



Eab [Hxyz( S i)] = E E y)- B AB [x(9A(Ax\y) ^ x)} 

UV {x,y)eTuv 

\l 



<E E ^(x,y)[max{^|£^,l}2-"[l^-^l^l + -^ + ^ 
t/V (x,y)eTuv L M m I J 

< m ax(^J-^- i\\^2-"[ D ^H' uxy ) + l K - A - Ef ( c/ l v )l + - A ^ 

< maxJ 1 ^ 1 ^"- 4 , ll 2-»[^W^)-2A^] + ^ (1Q6) 



|Iia4| 

where the first inequality comes from (105), the second inequality comes from Lemmas 13 and 19, and the
last inequality comes from Lemma 18 and the definition of $F_{XY}$. Similarly, we have

Eab [ Pxy ^)\ < -ax{^|^,l}2-^(^)- 2 ^ + fa, (107) 



where 



F X ~ Z {R) ee min [D{v x ~ z \\^) + \R- H(U\V')\+] 



Finally, from (104), (106), and (107), we have 

g A B(Ax\y) ^ x 



Eai 



Pxyz \ 



(x,y,z) : 

or 9 A bb( Ax > Bx \ z ) ^ x 



< max 
+ Pa + Pab 



I \ X \ Ua A A 2 -nm Fxv (R A )-2X xv ] , ma: . ( \X\ lA+lB <XAB ^ 1 2 _ n[inf F x - (R A +R B )-2\ xS ] 

\ \ImA\ ' J \ |Inx4||InuB| ' J 
where the infimum is taken over all $\mu_{XYZ}$ satisfying (65) and (66). This implies that there are $A \in \mathcal{A}$ and
$B \in \mathcal{B}$ such that
/ 

Pxyz l(x,y,z): 

A ° r 9abb( Ax > Bx \ z ) ^ x 

\U~. > f \X\ lA+lB a AB 



g A B(Ax\y) ^ x 



< max 
+ Pa + Pab- 

Since 



f M AOiA A 2 -n[inlF XY (R A )-2X X y] , ( 1*1 A + Ba AB A 2 -„[in{ F xg (R A +R B )-2X xS ] 

\ \lmA\ ' J I |Im-4||ImB| ' J 



(108) 



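\begin{align*}
\inf_{\mu_{XY} :\, H(X|Y) < R_A} F_{XY}(R_A) &> 0 \\
\inf_{\mu_{XZ} :\, H(X|Z) < R_A + R_B} F_{XZ}(R_A + R_B) &> 0,
\end{align*}

the right hand side of (108) goes to zero as $n \to \infty$ by assuming the properties (H1)-(H3) of $(\mathcal{A}, p_A)$
and $(\mathcal{A} \times \mathcal{B}, p_{AB})$. ■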



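H. Method of Types

We use the following lemmas for a set of typical sequences.
Lemma 13 ([6, Lemma 2.6][18, Lemma 21]):

\begin{align*}
\frac{1}{n}\log\frac{1}{\mu_{UV}(u,v)} &= H(\nu_{uv}) + D(\nu_{uv} \| \mu_{UV}) \\
\frac{1}{n}\log\frac{1}{\mu_{U|V}(u|v)} &= H(\nu_{u|v} | \nu_{v}) + D(\nu_{u|v} \| \mu_{U|V} | \nu_{v}).
\end{align*}

Lemma 14 ([22, Theorem 2.5][18, Lemma 22]): If $v \in \mathcal{T}_{V,\gamma}$ and $u \in \mathcal{T}_{U|V,\gamma'}(v)$, then $(u,v) \in \mathcal{T}_{UV,\gamma+\gamma'}$.
If $(u, v) \in \mathcal{T}_{UV,\gamma}$, then $u \in \mathcal{T}_{U,\gamma}$ and $u \in \mathcal{T}_{U|V,\gamma}(v)$.

Lemma 15 ([22, Theorem 2.7][18, Lemma 24]): Let $0 < \gamma \le 1/8$. Then,

\[ \left| \frac{1}{n}\log\frac{1}{\mu_U(u)} - H(U) \right| \le \zeta_{\mathcal{U}}(\gamma) \]

for all $u \in \mathcal{T}_{U,\gamma}$, and

\[ \left| \frac{1}{n}\log\frac{1}{\mu_{U|V}(u|v)} - H(U|V) \right| \le \zeta_{\mathcal{U}|\mathcal{V}}(\gamma'|\gamma) \]

for $v \in \mathcal{T}_{V,\gamma}$ and $u \in \mathcal{T}_{U|V,\gamma'}(v)$, where $\zeta_{\mathcal{U}}(\gamma)$ and $\zeta_{\mathcal{U}|\mathcal{V}}(\gamma'|\gamma)$ are defined in (5) and (6), respectively.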
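Lemma 16 ([22, Theorem 2.8][18, Lemma 25]): For any $\gamma > 0$ and $v \in \mathcal{V}^n$,

\begin{align*}
\mu_U([\mathcal{T}_{U,\gamma}]^c) &\le 2^{-n[\gamma - \lambda_{\mathcal{U}}]} \\
\mu_{U|V}([\mathcal{T}_{U|V,\gamma}(v)]^c \mid v) &\le 2^{-n[\gamma - \lambda_{\mathcal{U}\mathcal{V}}]},
\end{align*}

where $\lambda_{\mathcal{U}}$ and $\lambda_{\mathcal{U}\mathcal{V}}$ are defined in (4).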

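Lemma 17 ([22, Theorem 2.9][18, Lemma 26]): For any $\gamma > 0$,

\[ \left| \frac{1}{n}\log|\mathcal{T}_{U,\gamma}| - H(U) \right| \le \eta_{\mathcal{U}}(\gamma), \]

where $\eta_{\mathcal{U}}(\gamma)$ is defined in (7).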
Lemma 18 ([6, Lemma 2.2]): The number of different types of sequences in $\mathcal{U}^n$ is fewer than $[n + 1]^{|\mathcal{U}|}$.
The number of conditional types of sequences in $\mathcal{U}^n \times \mathcal{V}^n$ is fewer than $[n + 1]^{|\mathcal{U}||\mathcal{V}|}$.
Lemma 19 ([6, Lemma 2.3]): For a type $U$ of a sequence in $\mathcal{X}^n$,

\[ 2^{n[H(U) - \lambda_{\mathcal{X}}]} \le |\mathcal{T}_U| \le 2^{nH(U)}, \]

where $\lambda_{\mathcal{X}}$ is defined in (4).
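Lemmas 18 and 19 can be checked exhaustively for small block lengths. The sketch below assumes that $\lambda_{\mathcal{X}}$ in (4) has the standard form $|\mathcal{X}|\log[n+1]/n$; since (4) is not reproduced here, this form is an assumption, as are the function names. It enumerates every type of length $n$ over a small alphabet, counts the types, and compares the exact size of each type class (a multinomial coefficient) with the two exponential bounds.

```python
import math


def type_class_size(counts):
    """Number of sequences whose symbol counts equal `counts` (a multinomial coefficient)."""
    size, remaining = 1, sum(counts)
    for k in counts:
        size *= math.comb(remaining, k)
        remaining -= k
    return size


def type_entropy(counts):
    """Entropy in bits of the empirical distribution given by `counts`."""
    n = sum(counts)
    return -sum(k / n * math.log2(k / n) for k in counts if k > 0)


def all_types(n, alphabet_size):
    """Yield every vector of non-negative symbol counts summing to n."""
    if alphabet_size == 1:
        yield (n,)
        return
    for k in range(n + 1):
        for rest in all_types(n - k, alphabet_size - 1):
            yield (k,) + rest


def check_type_bounds(n, alphabet_size):
    """Check the counting bound of Lemma 18 and the size bounds of Lemma 19."""
    lam = alphabet_size * math.log2(n + 1) / n  # assumed form of lambda_X in (4)
    types = list(all_types(n, alphabet_size))
    count_ok = len(types) < (n + 1) ** alphabet_size
    size_ok = all(
        n * (type_entropy(t) - lam) - 1e-9
        <= math.log2(type_class_size(t))
        <= n * type_entropy(t) + 1e-9
        for t in types
    )
    return count_ok and size_ok
```

The small tolerance in the comparison only absorbs floating-point rounding; the bounds themselves are exact statements about integers.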

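Lemma 20 ([19, Lemma 7][16, Lemma 2]): For $y \in \mathcal{T}_V$,

\begin{align*}
|\{x' : H(x') \le H(U)\}| &\le 2^{n[H(U) + \lambda_{\mathcal{X}}]} \\
|\{x' : H(x'|y) \le H(U|V)\}| &\le 2^{n[H(U|V) + \lambda_{\mathcal{X}\mathcal{Y}}]},
\end{align*}

where $\lambda_{\mathcal{X}}$ and $\lambda_{\mathcal{X}\mathcal{Y}}$ are defined in (4).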

Lemma 21 ([19, Lemma 7]): For any probability distribution [ix on X, 

\X\ 

mmd(vu,Lix) < — 
\X\ 

minD(tv||Mx) < 



" nminx W ( I )>owW 

2IA-I 



min|i2"(X) - < 



!7 nmin a; . jUX ( :i; ) >0/ ux(a;) 
where minimum is taken over all types U of the sequence in X n . 